Refinements, scope, enumerables oh my!
problem
A while ago I tried to extend Ruby's Enumerable module with a new method using a refinement. The method followed the standard Enumerable pattern to support both external and internal iteration, e.g.:
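module Multiply
  refine Enumerable do
    def multiply(factor)
      # No block: hand back an Enumerator that will call this method again later.
      return to_enum(__method__, factor) unless block_given?
      each { yield it * factor }
    end
  end
end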
Internal iteration worked just as I expected:
using Multiply
(1..10).multiply(3) { puts it }
# 3
# 6
# 9
# ... etc.
Unfortunately, when I tried to use this for external iteration or to chain it with other Enumerable methods, I got a surprising error:
using Multiply
tripled = (1..10).multiply(3)
tripled.next
# 'Enumerator#each': undefined method 'multiply' for an instance of Range (NoMethodError)
(1..10).multiply(3).to_a
# 'Enumerator#each': undefined method 'multiply' for an instance of Range (NoMethodError)
What was going on here? How could I fix it?
solution
As I found out, the basic solution is to return an explicitly initialized Enumerator object when no block is given, and to run the desired logic inside that Enumerator's block.
We'll build up to this, but I wound up with something like this:
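module Multiply
  # The real logic lives on the module itself, outside the refinement.
  def self.logic(enum, factor) = enum.each { yield it * factor }
  refine Enumerable do
    def multiply(factor, &block)
      return Multiply.logic(self, factor, &block) if block
      # This block is written inside the refinement's lexical scope.
      Enumerator.new(self.size) do |yielder|
        Multiply.logic(self, factor) { yielder << it }
      end
    end
  end
end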
This test suite demonstrates that the above works:
require "minitest/autorun"
describe Multiply do
  describe "internal iteration" do
    it do
      class Tester
        using Multiply
        def self.test
          results = []
          [1, 2, 3].multiply(3) { results << it }
          results
        end
      end
      results = Tester.test
      _(results).must_equal([3, 6, 9])
    end
  end
  describe "external iteration" do
    it do
      class Tester
        using Multiply
        def self.test
          enum = [1, 2, 3].multiply(3)
          result = []
          enum.each { result << it }
          result
        end
      end
      result = Tester.test
      _(result).must_equal([3, 6, 9])
    end
    it do
      class Tester
        using Multiply
        def self.test
          enum = [1, 2, 3].multiply(3)
          result = []
          # Kernel#loop rescues the StopIteration that #next raises when exhausted.
          loop { result << enum.next }
          result
        end
      end
      result = Tester.test
      _(result).must_equal([3, 6, 9])
    end
  end
end
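Beyond what the suite covers, here is a small usage sketch of chaining on the returned Enumerator. It re-declares the module with explicit block parameters instead of `it`, on the assumption that you may want to try it on a Ruby older than 3.4; the Demo module and its method names are hypothetical and exist only for this illustration.

```ruby
module Multiply
  # Same logic as in the post, written with an explicit block parameter.
  def self.logic(enum, factor)
    enum.each { |x| yield x * factor }
  end

  refine Enumerable do
    def multiply(factor, &block)
      return Multiply.logic(self, factor, &block) if block

      Enumerator.new(size) do |yielder|
        Multiply.logic(self, factor) { |x| yielder << x }
      end
    end
  end
end

# Hypothetical demo module; `using` in a module body activates the refinement
# for the rest of that body.
module Demo
  using Multiply

  # External iteration with #next.
  def self.external
    enum = (1..3).multiply(3)
    [enum.next, enum.next, enum.next]
  end

  # Chaining with another Enumerator method.
  def self.chained
    (1..3).multiply(3).with_index.to_a
  end

  # The size passed to Enumerator.new is what Enumerator#size reports.
  def self.reported_size
    (1..3).multiply(3).size
  end
end

p Demo.external      # => [3, 6, 9]
p Demo.chained       # => [[3, 0], [6, 1], [9, 2]]
p Demo.reported_size # => 3
```

Note that none of the chained calls need the refinement themselves: once the Enumerator exists, the only code that calls into Multiply is the block that was written inside the refinement.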
explanation of the problem
Basically, the problematic code from the start of this post runs afoul of the following part from the refinement documentation:
When control is transferred outside the scope, the refinement is deactivated. This means that if you require or load a file or call a method that is defined outside the current scope the refinement will be deactivated
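Stripped of Enumerable entirely, the quoted rule can be sketched like this. Shout, Outside, and Inside are hypothetical names made up for the illustration; the only assumption is a Ruby version with refinements (2.1 or later).

```ruby
# Hypothetical refinement adding String#shout.
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

# Defined where the refinement is NOT active.
module Outside
  def self.call_shout(str)
    str.shout
  end
end

module Inside
  using Shout

  # Direct call: written in this scope, so the refinement applies.
  def self.direct
    "hi".shout
  end

  # Indirect call: control passes to Outside, where the refinement is inactive,
  # even though the string was created here.
  def self.indirect
    Outside.call_shout("hi")
  rescue NoMethodError
    :no_method_error
  end
end

p Inside.direct   # => "HI!"
p Inside.indirect # => :no_method_error
```

The string passed to Outside is the very same kind of object in both cases; what differs is where the call to shout is written.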
To illustrate this, let's write our own version of Object#to_enum (which Ruby implements in C):
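class SendEnumerator
  include Enumerable
  def initialize(obj, method, *args, **kwargs)
    @enumerator = Enumerator.new do |yielder|
      # Calls back into obj from here, outside any refined scope.
      obj.send(method, *args, **kwargs) { yielder << it }
    end
  end
  def next
    @enumerator.next
  end
end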
We'll use it like so:
module Multiply
  refine Enumerable do
    def multiply(factor)
      return SendEnumerator.new(self, __method__, factor) unless block_given?
      each { yield it * factor }
    end
  end
end
This behaves the same way as the initial code:
describe Multiply do
  it do
    class Tester
      using Multiply
      def self.test
        result = []
        (1..3).multiply(3) { result << it }
        result
      end
    end
    _(Tester.test).must_equal([3, 6, 9])
  end
  it do
    class Tester
      using Multiply
      def self.test
        enum = (1..3).multiply(3)
        enum.next
      end
    end
    _{ Tester.test }.must_raise(NoMethodError)
  end
end
This makes the issue pretty clear: to_enum effectively returns an object whose methods are defined outside of the lexical scope where the refinement is enabled. As a result, when that object later calls multiply, it does not see the extension method.
"Okay," I thought, "but the SendEnumerator is an Enumerable and is constructed in a spot where the refinement is enabled." That's true, but refinements are not attached to objects themselves. They're attached (enabled, in the documentation's language) to lexical scopes. If we could edit the source code of the object returned by to_enum to have using Multiply before its #next implementation, then we would not have a problem, e.g.:
class SendEnumerator
  include Enumerable
  using Multiply # CRUCIAL!!!
  def initialize(obj, method, *args, **kwargs)
    @enumerator = Enumerator.new do |yielder|
      obj.send(method, *args, **kwargs) { yielder << it }
    end
  end
  def next
    @enumerator.next
  end
end
describe Multiply do
  it do
    class Tester
      using Multiply
      def self.test
        enum = (1..3).multiply(3)
        enum.next
      end
    end
    _(Tester.test).must_equal(3)
  end
end
Of course, we can't (and wouldn't want to) edit every piece of core source code to use our extension. That would negate the point of refinements!
But why does the explicit Enumerator with block solution work then? We're returning Enumerator objects in both cases! This is because blocks capture their lexical scope, and so the block passed to Enumerator.new captures the active refinement. You can actually see this by inspecting the Method object that a block resolves, e.g.:
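class String
  def f = :original
end

module F
  refine String do
    def f = :refined
  end
end

module A
  def self.f1 = -> { "".method(:f) }
  using F
  def self.f2 = -> { "".method(:f) }
end

puts A.f1.call.inspect # => #<Method: String#f() capture.rb:2>
puts A.f2.call.inspect # => #<Method: String(#<refinement:String@F>)#f() capture.rb:7>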
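The same capture can be sketched with a block that is invoked by code that knows nothing about the refinement. Doubling, Runner, and Caller are hypothetical names used only here, and explicit block parameters are used on the assumption that the Ruby in use may predate `it`.

```ruby
# Hypothetical refinement adding Integer#doubled.
module Doubling
  refine Integer do
    def doubled
      self * 2
    end
  end
end

# No refinement is active here; this method just yields to whatever block it gets.
module Runner
  def self.run
    yield 21
  end
end

module Caller
  using Doubling

  def self.via_block
    # The block body is written in this scope, so `doubled` resolves with
    # Doubling active even though Runner.run is what invokes the block.
    Runner.run { |n| n.doubled }
  end
end

p Caller.via_block # => 42
```

This is the same mechanism the Enumerator.new block relies on: the Enumerator machinery runs the block, but method lookup inside the block follows the scope where the block was written.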
building up to my solution
Now that I understood pretty well what was going on, I could take a crack at a solution. This section is optional. Feel free to skip it if you don't care to follow the thought process that led to the solution presented at the beginning.
Throughout I'm going to be using the following test suite:
describe Multiply do
  it do
    class Test
      using Multiply
      def self.run
        enum = (1..3).multiply(3)
        enum.next
      end
    end
    _(Test.run).must_equal(3)
  end
  it do
    class Test
      using Multiply
      def self.run = (1..3).multiply(3).to_a
    end
    _(Test.run).must_equal([3, 6, 9])
  end
end
reusable enumerator
Since we need to return an Enumerator when no block is provided, and since an Enumerator is also Enumerable, it made sense to me to consolidate the logic into an Enumerator first and use it slightly differently depending on the presence of a block:
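module Multiply
  refine Enumerable do
    def multiply(factor, &block)
      enum = Enumerator.new(size) do |yielder|
        each { yielder << it * factor }
      end
      return enum unless block
      # The factor is already applied inside the Enumerator, so only the block is passed.
      enum.each(&block)
    end
  end
end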
# Running:
# ..
# Finished in 0.000900s, 2222.2223 runs/s, 2222.2223 assertions/s.
# 2 runs, 2 assertions, 0 failures, 0 errors, 0 skips
That worked! I was on the right track. The main downside is that this always allocates an Enumerator object, even when one isn't needed. If a block is given I shouldn't need to create an Enumerator; I should be able to do internal iteration directly.
reusable method
This iteration moves the core logic into a private helper method. It's named with an underscore prefix in a basic attempt to avoid name collisions:
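module Multiply
  refine Enumerable do
    def multiply(factor, &block)
      if block_given?
        # Forward the caller's block; otherwise the yield in _multiply has nothing to call.
        _multiply(factor, &block)
      else
        Enumerator.new(size) do |yielder|
          _multiply(factor) { yielder << it }
        end
      end
    end
    private
    def _multiply(factor)
      each { yield factor * it }
    end
  end
end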
# Running:
# ..
# Finished in 0.000911s, 2195.3898 runs/s, 2195.3898 assertions/s.
# 2 runs, 2 assertions, 0 failures, 0 errors, 0 skips
I don't like that this pollutes the namespace of Enumerable with the helper wherever the refinement is active! It's a small thing, but I'd prefer the refinement to be as focused as it can be.
logic in module
This is the solution the post started with! It moves the real logic into a class method on the refining module. The method added by the refinement then becomes something of an adapter that makes the logic available on Enumerable.
module Multiply
  def self.logic(enum, factor) = enum.each { yield factor * it }
  refine Enumerable do
    def multiply(factor, &block)
      if block
        Multiply.logic(self, factor, &block)
      else
        Enumerator.new(size) do |yielder|
          Multiply.logic(self, factor) { yielder << it }
        end
      end
    end
  end
end
# Running:
# ..
# Finished in 0.000810s, 2469.1358 runs/s, 2469.1358 assertions/s.
# 2 runs, 2 assertions, 0 failures, 0 errors, 0 skips
recap
To sum up: refinements are tied to lexical scopes, not to objects. If you need behavior that depends on a refinement to "escape" beyond its lexical scope, you'll need to use blocks to capture that scope. For Enumerables specifically, this means you will likely want a pattern where the logic itself lives outside of the refinement and is either called directly with the caller's block (when a block is given) or invoked inside an Enumerator.new block (when it is not).