The other day I was writing a script in Ruby that needed to persist data. I wanted the script to be as self-contained as possible, so I didn't want to use a database and I didn't want to write to an external file. I wanted just one file with both the script and the persisted data.

Ruby has a special token, __END__. When it appears on a line by itself, nothing after it gets executed. What is handy about __END__ is that you can get a file handle pointing to this location, or rather just after it, using the DATA constant. This got me halfway to solving my problem. Reading data from the DATA constant is easy, but writing to it isn't as simple.
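To make the reading half concrete, here is a minimal sketch. DATA is only defined for the main script file, so the sketch writes a tiny script to a temporary file and runs it with the current interpreter. The names READER_SCRIPT and run_reader_script are made up for this illustration.

```ruby
require 'rbconfig'
require 'tmpdir'

# A tiny script whose data section holds two lines of text.
# (Hypothetical example script, not part of the original post.)
READER_SCRIPT = <<~'RUBY'
  DATA.each_line { |line| puts "read: #{line}" }
  __END__
  first
  second
RUBY

# Writes the script to a temporary file, runs it as its own process,
# and returns whatever it printed.
def run_reader_script
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'reader.rb')
    File.write(path, READER_SCRIPT)
    IO.popen([RbConfig.ruby, path], &:read)
  end
end

puts run_reader_script
```

The inner script never opens a file itself. DATA is already an open handle positioned just past the __END__ line.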

The trick is to ask DATA for its position in the file with DATA.pos, before reading anything from it. Save whatever DATA.pos returns to a variable, and use it to seek back to this position when writing to the file. For my script I wanted to persist key-value data. The following code treats DATA as a key-value data store, and can come in handy. A word of caution, however: if something were to go wrong, you could overwrite your script by mistake. There is also one caveat: you need to make sure you put a single line return after __END__, otherwise the data ends up on the same line as the marker.
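Stripped down to its essentials, the seek-and-write trick might look like the following sketch: a script that keeps a run counter in its own data section. As before, the script is written to a temporary file and run as a separate process so that DATA is defined. COUNTER_SCRIPT and run_counter_twice are names invented for this example, and the truncate call is my own addition to guard against leftover bytes.

```ruby
require 'rbconfig'
require 'tmpdir'

# A script that stores how many times it has been run after __END__.
# (Hypothetical example script for illustration.)
COUNTER_SCRIPT = <<~'RUBY'
  data_pos = DATA.pos          # offset of the first byte after the __END__ line
  count = DATA.read.to_i + 1
  DATA.close

  File.open(__FILE__, 'r+') do |file|
    file.seek(data_pos, IO::SEEK_SET)  # jump back to the start of the data
    file.write("#{count}\n")
    file.truncate(file.pos)            # drop leftovers from a longer old value
  end
  puts count
  __END__
  0
RUBY

# Runs the script twice from a temporary file and returns both outputs.
def run_counter_twice
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'counter.rb')
    File.write(path, COUNTER_SCRIPT)
    Array.new(2) { IO.popen([RbConfig.ruby, path], &:read).strip }
  end
end

p run_counter_twice
```

The second run sees the value written by the first, which shows that the data really was persisted inside the script file.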


require 'yaml' # the default serializer

module RubySourceDataStore
  class DataStore
    DEFAULT_OPTIONS = {
      auto_save: true,
      serializer_name: :YAML
    }

    def initialize(data_file_handler, options = {})
      @data = data_file_handler
      @data_pos = data_file_handler.pos

      DEFAULT_OPTIONS.merge(options).each_pair do |key, value|
        self.instance_variable_set("@#{key}", value) if DEFAULT_OPTIONS.keys.include?(key)
      end
    end

    def get(key)
      data_store[key]
    end

    def set(key, value)
      # Go through the lazy loader so this also works before any call to get
      data_store[key] = value
      save if auto_save?
    end

    def save
      File.open(@data, 'r+') do |file|
        file.seek(@data_pos, IO::SEEK_SET)
        file.write(serializer.dump(data_store))
        # Cut off leftover bytes when the new data is shorter than the old
        file.truncate(file.pos)
      end
    end

    private
    def serializer
      @serializer ||= Kernel.const_get(@serializer_name)
    end

    def auto_save?
      true if @auto_save
    end

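    def data_store
      @data_store ||= File.open(@data, 'r') do |file|
        file.seek(@data_pos, IO::SEEK_SET)
        contents = file.read
        # An empty data section has nothing to load; start with an empty hash
        contents.strip.empty? ? {} : serializer.load(contents)
      end
    end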
  end
end

Here is a quick little example of how you could use this.

# make a new instance of DataStore, passing it the DATA constant
# you can also pass optional arguments auto_save and serializer_name
# auto_save         Lets you specify if the data store should automatically
#                   save after updating the values in the data store.
#                   By default it is set to true.
#                   
# serializer_name   Lets you specify the name of a serializer to use.
#                   By default it will use YAML.
data_store = RubySourceDataStore::DataStore.new(DATA)

# Fetch a value from the data store
data_store.get(:tasks)

# Set a value in the data store
data_store.set(:tasks, new_tasks_list)
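
A note on the serializer_name option: the class looks the serializer up by constant name with Kernel.const_get and only ever calls dump and load on it, so anything with those two methods should fit. The sketch below repeats that lookup outside the class to compare a few standard-library candidates. The round_trip helper is a made-up name for this illustration.

```ruby
require 'json'
require 'yaml'

# Resolves a serializer by constant name, then dumps and reloads a value.
# (Hypothetical helper, mirroring the lookup the DataStore class performs.)
def round_trip(serializer_name, value)
  serializer = Kernel.const_get(serializer_name)
  serializer.load(serializer.dump(value))
end

tasks = { tasks: ['write post', 'ship script'] }

p round_trip(:YAML, tasks)     # symbol keys survive the round trip
p round_trip(:Marshal, tasks)  # binary format, also keeps symbol keys
p round_trip(:JSON, tasks)     # keys come back as strings
```

One thing to watch for with JSON: a value stored under :tasks would have to be fetched with "tasks" after a reload, since JSON has no notion of symbols.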