Solving Update Races in Balrog: The Plan

The coding period for Google Summer of Code has begun, so here is the plan for my project, as promised.

Currently, when two submitter tasks request the same blob from Balrog at the same time, they both receive the same data_version and both send it back along with their added locale. The server rejects one of the two submissions, and that submitter has to retry. In most cases the updates could simply be merged, avoiding the retry. For example, this is roughly the series of events that happens now:

  1. Submitter 1 requests data from Balrog (and receives data specifying data_version = 1)
  2. Submitter 2 requests data from Balrog (data_version = 1)
  3. Submitter 1 submits its release blob to Balrog with data_version = 1 (the server accepts it and data_version changes to 2)
  4. Submitter 2 fails to submit its release blob to Balrog (data_version is now 2, but the submitter specified 1 in its request)

Most of the time, the data in the two blobs is identical apart from some additions, such as new locales. So we can reduce submission failures by devising a way to merge the blob versions when we receive a request with an outdated data_version. For my project, I will implement this merging of two blobs, using a three-way merge algorithm similar to what git uses for merges. Since we currently do not have a library for this task, I will write a module for three-way merges of Python data structures (and hopefully get it published on PyPI!) and use it within Balrog to accomplish the goals of this project.

The basic algorithm for three-way merges is described below (a rough code sketch follows the list):

  1. We first calculate a diff of each new version of the blob (data_version 2 and the version that failed to submit, in our example) against the old version (data_version 1). There are multiple tools that produce diffs in the form we want; DictDiffer and deepdiff are two alternatives we could use for this step. Both return an easy-to-use list of the differences between our blobs.
  2. If neither diff changes a particular item, or if both diffs add an element with the same value, we include the item in the resulting object.
  3. If both diffs have removed some item, we ignore that item and do not include it in our output.
  4. If one diff has added an element while the other hasn't, we add the element.
  5. If both diffs have added or changed an element but with unequal values, we recursively apply our algorithm. If there is a difference in a string that was present in the root version (data_version 1 in our example), we might apply a string three-way merge algorithm.
  6. For lists, we can treat every list element as a line of text and apply a traditional three-way merge algorithm to it.
  7. For all other cases, we may consider the two changes to be conflicting and we may apply some conflict resolution strategies.
  8. If the type of an element changes, we can consider it to be a merge conflict.
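
To make this concrete, here is a rough Python sketch of how the recursive merge could look for dictionaries. The function name and the conflict handling are illustrative only; the real module will also need list and string merging and the conflict-resolution strategies discussed further below.

# A rough sketch of a recursive three-way merge for dictionaries. The name
# and the conflict handling are illustrative; the real module will also need
# list and string merging plus configurable conflict-resolution strategies.
def three_way_merge(base, left, right):
    """Merge two descendants (left, right) of a common ancestor (base)."""
    merged = {}
    for key in set(base) | set(left) | set(right):
        in_left, in_right = key in left, key in right
        if not in_left and not in_right:
            # Removed (or never present) on both sides: leave it out.
            continue
        if in_left and not in_right:
            # Present only on the left: keep it, unless the right side
            # deleted a value that the left side did not change.
            if not (key in base and base[key] == left[key]):
                merged[key] = left[key]
        elif in_right and not in_left:
            if not (key in base and base[key] == right[key]):
                merged[key] = right[key]
        elif left[key] == right[key]:
            # Both sides agree (unchanged, or the same change): keep it.
            merged[key] = left[key]
        elif key in base and base[key] == left[key]:
            merged[key] = right[key]   # only the right side changed it
        elif key in base and base[key] == right[key]:
            merged[key] = left[key]    # only the left side changed it
        elif isinstance(left[key], dict) and isinstance(right[key], dict):
            # Both sides changed a nested dict: recurse into it.
            merged[key] = three_way_merge(base.get(key, {}), left[key], right[key])
        else:
            # Conflicting changes: a real implementation would fall back to a
            # list/string merge or a conflict-resolution strategy here.
            raise ValueError("merge conflict on key %r" % (key,))
    return merged

For instance, if two submitters each add a different locale to the same base blob, this merge keeps both additions instead of rejecting one of them.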

Further analysis needs to be done on the handling of list and tuple values, since the preservation of order might be important in those data structures. We might want to support two modes: one where preserving the order of the list elements is important and another where it isn't. In some cases, merging strings might also be undesirable, so string merging might need to be made optional as well.

We may employ several merge-conflict resolution strategies:

  • Select one of the changes.
  • Discard both changes.

This algorithm is based on this research paper and this email. Both show how a three-way merge works for strings.

Here is a link to my proposal. Keep checking this space every week for updates on my project, and feel free to point out anything I could do better! Thank you :)

Multifile Responses in Balrog

Apart from Firefox, Balrog is also used by Mozilla to provide updates for the Gecko Media Plugin (GMP) package. The Gecko Media Plugin package contains various plugins for media support, like the OpenH264 codec and the Widevine plugin. To handle updates to these, we have a special GMP blob that lists updates to the plugins. Updates to every plugin are included in one blob. This leads to problems when there are multiple versions of a plugin that we can serve. For example, we might serve OpenH264 version 1.5.3 on Firefox 42 on Windows and version 1.5.2 on Firefox 40. We have to maintain a blob for each possible combination of versions. As the number of available versions grows, this way of serving updates becomes intractable.

Ben and I discussed various strategies we could use to tackle this. A gist of what went through our minds is on the bug page. What we eventually went ahead with was to add a blob type that gets its contents from other blobs. We call this blob type SuperBlob.

SuperBlob

A SuperBlob is basically a redirection mechanism: it just contains the names of the products that we wish to include in the generated XML:

{
    "name": "fake",
    "schema_version": 4000,
    "products": [
        "c",
        "d"
    ]
}

This SuperBlob is called fake. It simply asks Balrog to look at the products named c and d. A rule can be pointed at this SuperBlob if we want the response to include all the files listed for products c and d.

How does this work?

The web interface, while processing a request, checks whether the rule evaluates to a SuperBlob. If it does, it gets the product names from the SuperBlob and fetches the corresponding blobs by re-evaluating the query with the requested product changed. So, in our example, the web interface will evaluate the query with the product name changed to c and then to d and obtain the resulting blobs. It will take the header and footer XML from the output of processing the blob obtained from the first product, and the inner XML will be the concatenated inner XMLs of all the blobs obtained from the products listed in the SuperBlob.
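
Roughly, the response assembly could look like the sketch below. Here evaluate_query is a stand-in for whatever Balrog uses internally to turn a single-product query into its header, inner and footer XML; the names are mine, not Balrog's actual API.

# A hypothetical sketch of how a SuperBlob response could be put together.
# evaluate_query(query) is assumed to return a (header, inner, footer) tuple
# of XML strings for one product; it stands in for Balrog's internal logic.
def evaluate_superblob(superblob, query, evaluate_query):
    parts = []
    for product in superblob["products"]:
        # Re-run the same update query, but with the product swapped out.
        parts.append(evaluate_query(dict(query, product=product)))
    # Header and footer come from the first product's response; the inner
    # XML of every response is concatenated in between.
    header, _, footer = parts[0]
    inner = "\n".join(inner_xml for _, inner_xml, _ in parts)
    return "\n".join([header, inner, footer])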

So, if product c gave:

<updates>
    <update type="minor" version="None" extensionVersion="2.5" buildID="25">
        <patch type="complete" URL="http://a.com/b" hashFunction="sha512" hashValue="23" size="27777777"/>
    </update>
</updates>

and product d gave:

<updates>
    <update type="minor" version="None" extensionVersion="2.5" buildID="25">
        <patch type="complete" URL="http://a.com/public" hashFunction="sha512" hashValue="23" size="22"/>
    </update>
</updates>

the SuperBlob will give:

<updates>
    <update type="minor" version="None" extensionVersion="2.5" buildID="25">
        <patch type="complete" URL="http://a.com/public" hashFunction="sha512" hashValue="23" size="22"/>
        <patch type="complete" URL="http://a.com/b" hashFunction="sha512" hashValue="23" size="27777777"/>
    </update>
</updates>

So, we can now have one fixed rule for all GMP responses and separate rules for each constituent plugin, without having to worry about all the version combinations like we did earlier.

While working on this, I also removed the createXML method in favour of three methods that return the header, the inner XML and the footer respectively. This helped in separating the various components of the XML output without having to parse it. The XML generation logic has moved to the client view.

Guess who got into Google Summer of Code?

I'm really excited to say that I'll be participating in Google Summer of Code this year, with Mozilla! I'm going to be working on the Balrog update server this summer, under Ben's guidance. Thank you Mozilla and Google for giving me this chance!

I'll be optimizing Balrog by devising a mechanism to handle update races. A future blog post will describe how exactly these races occur and how I aim to resolve them. Basically, an algorithm similar to what git uses for 3-way merges is required, but we also need to take care of nesting, since the 'Blob' data structure we use holds nested data. I will also share my proposal and the timeline I'll be following in the coming weeks.

These three months are going to be amazing and I'm really looking forward to working with the Mozilla RelEng community! I will be blogging weekly once the coding period commences, and I welcome any suggestions that might lead to me presenting a better final product.

External Plugins: Alternatives

  • Embedding a Python Interpreter:

The Python Docs show how this can be done. We can use Boost.Python to access a higher-level API. We'd just have to scan the contents of some predefined directory, where each Python script would be in some specific format (with functions like mimetypes, write or extract), and make a C++ shim that would generate a WriterPlugin or ExtractorPlugin for each script.

The main limitation of this approach is that supporting multiple languages would take a lot of effort, since I would need to embed an interpreter for each of them.

  • Creating a process for each plugin:

Every plugin would include a manifest that details how it is to be used. For example, a Java plugin would have to be run with the JVM, a Python one with CPython, etc. The manifest would also carry other information like supported mimetypes, write-back capabilities, etc. Whenever a WriterPlugin or an ExtractorPlugin is needed for a particular mimetype, all installed plugins are checked for support. Writing or reading metadata using a new plugin invokes a new process using the instructions specified in the manifest. There are several ways we could communicate with the process:

1. Using DBus: This would allow for more structured communication between the plugin and the framework, and it would also decouple the plugins. But it might be overkill, since only limited communication is needed between the plugin and the framework.

2. Using named pipes: This allows for decoupling too, but again, might be overkill since we aren't transferring large amounts of data.

3. Using stdin/stdout: This is the method Boudhayan Gupta has used in his patch. We need to handle error conditions carefully.

The data is exchanged in the JSON format.

We have decided to create a new process for each plugin and use stdin/stdout for communication, since it is the most straightforward way. If you think there are other ways, or that there is a discernible benefit to using any of the other methods I've listed, please let me know in the comments.
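
To illustrate what such a plugin could look like, here is a hypothetical Python extractor that reads a JSON request from stdin and writes a JSON response to stdout. The request and response fields are made up for illustration; the real protocol is whatever the framework ends up defining.

#!/usr/bin/env python3
# A hypothetical stdin/stdout JSON plugin. The request/response fields are
# illustrative only; the real protocol is defined by the framework.
import json
import sys

def extract(path):
    # A real plugin would open the file at `path` and read its metadata.
    return {"title": "Example", "mimetype": "application/x-example"}

def main():
    request = json.load(sys.stdin)   # e.g. {"action": "extract", "path": "/tmp/f"}
    if request.get("action") == "extract":
        response = {"ok": True, "properties": extract(request["path"])}
    else:
        response = {"ok": False, "error": "unsupported action"}
    json.dump(response, sys.stdout)
    sys.stdout.write("\n")

if __name__ == "__main__":
    main()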

Things to keep in mind (mostly suggested by vhanda on Boudhayan Gupta's patch):

  1. There needs to be a proper error-handling framework, to make sure that an invalid plugin does not block execution for long periods.

  2. Dependencies need to be handled, both for languages and language libraries. How I'm planning to do this with Python will be covered in a future post.

  3. There needs to be a way to select from amongst several plugins offering support for the same mimetypes.

So that's about it for this post. I'm currently working on this, so you should expect to see some code soon. Thanks, and as always, please feel free to suggest any improvements I could make to my approach!

Write Back Plugins: What's the Best Way?

I have mainly been working on the write-back plugin mechanism that we needed to incorporate into KFileMetaData. Most of my changes are visible at my fork.

The work so far has involved a couple of decisions:

  • Deciding the structure of the framework:

While writing the code for write-back plugins, I noticed a lot of commonality between what I was writing and the extractors. I thought of abstracting these common elements out into a generic Plugin class that would optionally support both reading and writing data. Alas, this could not be done: it would involve changing the ABI of KFileMetaData, breaking compatibility with other tools that use the current version. So, after consulting with my mentor, I decided to write the write-back plugin infrastructure from scratch, separate from the extractors. Developers working on both writer and extractor plugins will have to find a way to share data outside of the infrastructure we will provide.

  • Deciding what writers to write first:

Some of the libraries that we currently use for extraction don't support writing back yet. Poppler is one case. Though they are working on adding support for writing back PDF metadata, it's not ready yet, and I'll have to defer writing a plugin for PDF metadata (or else implement external writer support first, as BaloneyGeek on #kde-devel suggested). I have started with writing a rudimentary TagLib writer, and I'm going to progress into writing a writer for epub books as well.

What I have learned so far:

  • CMake: Though I have struggled a lot with it, my efforts are starting to bear fruit. I finally feel comfortable making my own CMakeLists.txt.

  • Qt: I'm learning a lot about all the functionalities Qt offers, and I try to integrate whatever I can from what I learn into the code. My mentor's suggestions have also taught me a lot about some less obvious parts.

  • C++: Exploring the code base, I chanced upon a lot of new patterns, like PIMPL, object reuse, etc. I try my best to understand why things are done the way they are, so that I can avoid the pitfalls that the creators of these patterns faced.

That's enough about what I've learned. I'll now outline some of the changes on my fork that you might see in the final framework:

  • A new WriterPlugin class:

This class will be subclassed by all plugins that hope to write data to some file. The most important functions are write and writeMimetypes. write is where the action is: WriterPlugins implement this function to actually write the metadata to disk. writeMimetypes simply lists the mimetypes supported by the plugin. I also plan to add a supportedProperties function, to make sure that applications do not attempt to write an unsuitable property to some file. This is all roughly analogous to ExtractorPlugin.

  • A Writer framework:

I have put in analogues of most of the functionality that Extractors offer. Instead of something similar to ExtractionResult, which needed to be subclassed, I have used WriteData, which is concrete. The rationale is that we won't need to store huge amounts of data to write back, since we don't support writing back text.

  • A TagLibWriter class:

This shows how the new Writer classes can be used. It is really rudimentary at the moment, but I plan to extend it to support writing back all the properties that can be read by TagLibExtractor.

The road ahead: This is just the beginning of what I have planned for the Season of KDE. I plan to work on Boudhayan Gupta's patch, change it so that it can be used for Writers and implement the suggestions put forth by vHanda to improve it.

It's been an exciting journey so far, and it's going to get even better! And if you have any suggestions for me to improve upon, please do let me know of them via comments. Thank you!

Signing off, Varun

My project was selected for the Season of KDE!

So this is where I'll be documenting my progress throughout the Season of KDE, as I had promised in my proposal. This will help me look back and improve on mistakes that I might have made earlier. I will post updates whenever I make any significant progress.

Please feel welcome to make suggestions, I'd love to know how I could do things better! Awesome things are (hopefully) coming!