<h1>Using Elasticsearch as the Primary Data Store</h1> <p><i>Volkan Yazıcı – vlkan.com – 2018-11-14</i></p> <p>The biggest e-commerce company in the Netherlands and Belgium, <a href="https://bol.com">bol.com</a>, set out on a four-year journey to rethink and rebuild their entire <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL (Extract, Transform, Load)</a> pipeline, which has been cooking up the data used by its search engine since the dawn of time. This more-than-a-decade-old, white-bearded giant, breathing in the dungeons of shady Oracle PL/SQL hacks, was in a state of decay, causing ever-increasing hiccups on production. A rewrite was inevitable. After drafting many blueprints, we went for a Java service backed by <strong>Elasticsearch as the primary storage!</strong> This idea brought shivers to even the most senior Elasticsearch consultants hired, so to ease your mind I’ll walk you through why we took such a radical approach and how we managed to escape our legacy.</p> <p>Before diving into the details, let me share a 2,000 ft overview of an e-commerce search setup that will help you gain a better understanding of the subjects discussed onwards. Note that this simplification totally omits a nebula of incorporated caching layers, systems orchestrating multiple search clusters, queues with custom flush and replay functionalities, in-place resiliency mechanisms, services maintaining deprecated search entities to avoid getting ranked down by bots due to 404s, circuit breakers, throttlers, load balancers, etc.
But it is still accurate enough to convey the general idea.</p> <p><img src="overview.jpg" alt="Architecture Overview"></p> <h1 id="table-of-contents">Table of Contents</h1> <ul> <li> <a href="#search">The Search</a> <ul> <li><a href="#what-is-search">What is search anyway?</a></li> <li><a href="#who-is-using-search">Who/What is using search?</a></li> <li><a href="#what-about-performance">What about performance?</a></li> <li><a href="#how-volatile">How volatile is the content?</a></li> </ul> </li> <li> <a href="#etl">The ETL</a> <ul> <li><a href="#content-stream">Real-time Content Stream</a></li> <li><a href="#configuration-stream">Configuration Stream</a></li> </ul> </li> <li> <a href="#operational-overview">Operational Overview</a> <ul> <li><a href="#configuration-mutations">Configuration Mutations</a></li> <li><a href="#configuration-predicates">Configuration Predicates</a></li> </ul> </li> <li><a href="#old-etl">The Old ETL</a></li> <li> <a href="#battle-of-storage-engines">The Battle of Storage Engines</a> <ul> <li><a href="#benchmark-setup">Benchmark Setup</a></li> <li><a href="#benchmark-results">Benchmark Results</a></li> </ul> </li> <li> <a href="#new-etl">The New ETL</a> <ul> <li><a href="#primary-storage-elasticsearch">The Primary Storage: Elasticsearch</a></li> <li><a href="#configuration-dsl-json-groovy">The Configuration DSL: JSON and Groovy</a></li> </ul> </li> <li><a href="#conclusion">Conclusion</a></li> <li><a href="#acknowledgements">Acknowledgements</a></li> <li><a href="#faq">F.A.Q</a></li> </ul> <p><a name="search"></a></p> <h1 id="the-search">The Search</h1> <p><i>[Before going any further, I want to take this opportunity to align you on what exactly I do mean by <em>search</em>. I hope this will help you to better wrap your mind around the ultimate consumer of ETL. 
That being said, feel free to skip this section and directly jump to the ETL deep dive in the next section.]</i></p> <p>Many people tend to make the mistake of having a narrow view of search in e-commerce and confining its use case to mere term scavenging in a mountainous stack of product attributes. While this statement holds to a certain extent, it resembles a cherry located at the tip of an iceberg. (In the <a href="/blog/post/2018/02/17/varnishing-search-performance/">Varnishing Search Performance</a> presentation, I tried to summarize how difficult it can get just to add a caching layer between your search logic and backend.) There are books written, university lectures offered, and computer science branches dedicated to the matter. But let me try to briefly elaborate on this from an engineering standpoint.</p> <p><a name="what-is-search"></a></p> <h2 id="what-is-search-anyway">What is search anyway?</h2> <p>If I were to give a general, but far from complete, overview, it enables one to</p> <ul> <li> <p>search for a term in hundreds of product attributes, where <em>matching</em> and <em>ranking</em> are curated with directly or indirectly available consumer (are you a PS4 owner searching for the newest “Call of Duty”?) and relevance (you probably meant a band by typing “The Doors”, which is irrelevant for the “Doors &amp; Windows” department) contexts,</p> </li> <li> <p>browse (basically a search without a term) in thousands of categories with ranking mechanics similar to those used in the search aforementioned,</p> </li> <li> <p>beam up directly to a certain product or category given that the input matches certain patterns (EAN, ISBN, ISSN, etc.) or merchandising rules (any syntactic and/or semantic combination of “wine glasses” should end the flow in a particular department, etc.),</p> </li> <li> <p>implicitly trigger multiple searches under the hood (e.g. narrowing down to a lower category or widening up to a higher category, etc.)
to enhance the results,</p> </li> <li> <p>and decorate every listing with faceting (you probably want to see a “Capacity” facet rather than “Shoe Size” while searching/browsing in “Harddisks”) support.</p> </li> </ul> <p><a name="who-is-using-search"></a></p> <h2 id="whowhat-is-using-search">Who/What is using search?</h2> <p>This is a big debate. But I know a handful of certain consumers:</p> <ul> <li> <p><strong>Customers:</strong> People who search for and buy goods. They look harmless, until one gets exposed to them on a <a href="https://en.wikipedia.org/wiki/Black_Friday_%28shopping%29">Black Friday</a>, where they work hand in hand in masses to <a href="https://en.wikipedia.org/wiki/Denial-of-service_attack">DDoS</a> the entire infrastructure.</p> </li> <li> <p><strong>Bots:</strong> They periodically (a couple of times a day at most, as of the date of this writing) try to digest your entire catalog into their system for two main purposes:</p> <ul> <li>Integrate the catalog into their own search engine (that is, Google),</li> <li>Tune their pricing strategy (that is, competitors)</li> </ul> <p>The worst part of handling bot traffic is that you cannot always throttle them (for instance, Google takes website latencies into account for rankings) and you need to make sure they do not harm the customer traffic. Food for thought: Imagine your customers swarming at your shop on Christmas Eve while Google decides to spider your entire catalog with thousands of requests per second.</p> </li> <li> <p><strong>Partners:</strong> Your business partners can also scan your catalog periodically to integrate it into their own systems. (Fun fact: Some even require a daily Excel export.) One can classify them as bots only interested in a subset of the data.</p> </li> <li> <p><strong>Internal services:</strong> Last time I counted, there were 20+ internal services using search to enhance their results, in addition to the users I listed above.
Their usage can constitute up to 50% of the traffic.</p> </li> </ul> <p>In the case of partners and internal services, one might ask why they need the search data rather than directly accessing the raw product attributes and offers. The answer is simple: They also use additional attributes (e.g., facets, categories) incorporated in the ETL pipeline. Hence, rather than exposing the internal ETL system to them, it is more convenient to serve them at the search gateway, which is known to have battle-tested scalability and resiliency measures.</p> <p><a name="what-about-performance"></a></p> <h2 id="what-about-performance">What about performance?</h2> <p>As decades-long experience in this domain indicates, making search 10 ms faster can yield millions of euros of extra revenue, depending on the scale of your business. Unfortunately, this equation works the other way around as well. Hence, you are always expected to perform under a certain latency and above a certain throughput threshold.</p> <p><a name="how-volatile"></a></p> <h2 id="how-volatile-is-the-content">How volatile is the content?</h2> <p>Very, very, very volatile! I cannot emphasize this enough and I believe this is a crucial difference that sets e-commerce search apart from Google-like search engines – recall the conflict between Google and Twitter over indexing tweets. Maybe examples can help to convey the idea better:</p> <ul> <li> <p>A product might have multiple offers (bol.com offer, partner offer, etc.) featuring varying properties (pricing, deliverability, discounts, etc.) where both the offers and/or their properties are highly volatile. The offer might run out of stock, the price might change, etc. While customer-facing web pages are enhanced with the most recent data at runtime, the search index might lag behind and provide an eventually consistent view. The volatility in this context might range from seconds to months. On prime time, e.g.
on Valentine’s Day, you don’t want your search engine to return gift listings that ran out of stock a couple of seconds ago.</p> </li> <li> <p>Your manual (triggered by shop specialists) and automated (artificial intelligence and machine learning driven) processes can alter the category tree, add new facets, tune the exposure of existing facets, modify the search behavior (e.g., flows triggered by merchandising rules), add context-sensitive (e.g. category-dependent) thesaurus entries and synonyms, introduce new rankings, etc. These changes might necessitate the update of millions of documents retroactively.</p> </li> </ul> <p>This <em>volatility</em> debate will take a prominent role while deciding on the architecture of the next ETL pipeline, which I will elaborate on in a minute.</p> <p><a name="etl"></a></p> <h1 id="the-etl">The ETL</h1> <p>In the domain of search at e-commerce, <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a> denotes the pipeline where the input is a multitude of information sources (product attributes, offers, discounts, rankings, facets, synonyms, thesaurus entries, etc.) and the output is the <a href="https://en.wikipedia.org/wiki/Denormalization">denormalized</a> input constituting search-ready documents optimized for search query performance. Wait a second… If an ETL pipeline serves only optimization purposes, doesn’t this sound like one can have search without it? Sorta… That is indeed possible to a certain extent. If we put the details aside for a moment, we can roughly compare the two approaches as follows:</p> <table> <thead> <tr> <th>Strategy</th> <th>Advantages</th> <th>Disadvantages</th> </tr> </thead> <tbody> <tr> <td><strong>Without ETL</strong></td> <td> Every change in the input sources takes immediate effect. (Hence, almost zero index-time cost.) </td> <td> Latency and throughput suffer dramatically due to the join and enrich operations necessitated on the input sources at query time.
</td> </tr> <tr> <td><strong>With ETL</strong></td> <td> Since all potential data to satisfy search requests has already been baked into the index, search necessitates the least amount of effort to satisfy a request at query time. </td> <td> Every change in the input sources necessitates pre-processing affecting a multitude of products, ranging from a couple to millions. </td> </tr> </tbody> </table> <p>Put another way, ETL is all about the trade-off between index-time versus query-time performance. In the light of all this, and given that</p> <ol> <li>our existing ETL was functionally comprehensive enough,</li> <li>the query-time performance of Elasticsearch was already suffering due to faceting, internally triggered queries, etc. to an extent that external caching becomes a necessity,</li> <li>and search latency has a big impact on the revenue,</li> </ol> <p>we took the thick ETL pipeline path.</p> <p>But what is this ETL pipeline really? What does it literally do? In order to answer these questions, let me focus your attention on the input sources going into the ETL pipeline:</p> <p><img src="etl.jpg" alt="ETL Input Sources"></p> <p><i>[GPC stands for <a href="https://www.gs1.org/standards/gpc">Global Product Classification</a>, which is the de facto commercial categorization of goods, varying from a car to a litre of milk.]</i></p> <p>These two input sources, content and configuration, feature two totally different execution patterns framing the functional requirements of the potential ETL solutions, and hence play the utmost critical role in justifying the plan we picked. Let’s examine them further:</p> <p><a name="content-stream"></a></p> <h2 id="real-time-content-stream">Real-time Content Stream</h2> <p>Here the ETL pipeline listens to more than a dozen queues for updates ranging from product attributes to offers, offer-specific discounts to rankings, etc., all formatted in <a href="https://json.org/">JSON</a>.
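<p>To make this concrete, here is what such a message might look like – a hypothetical sketch in Python, with field names that are purely illustrative rather than the actual bol.com message schema:</p>

```python
import json

# A hypothetical attribute-update message from the real-time content stream.
# The product identifier and attribute names are made up for illustration.
raw_message = """
{
  "product_id": "9200000012345678",
  "attrs": {
    "gpc": {"family_id": 1234, "chunk_id": 5678},
    "disk_capacity_bytes": 512000000000
  }
}
"""

message = json.loads(raw_message)
print(message["product_id"])  # the single product this update applies to
```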
Fortunately, each real-time content stream message triggers a single product update. Let me exemplify this with a case: when the <code>disk_capacity_bytes</code> attribute of a product changes, we</p> <ol> <li>first fetch the relevant document from the storage,</li> <li>update its <code>disk_capacity_bytes</code> attribute,</li> <li>apply the configuration(s) matching the last state of the updated document,</li> <li>and persist the obtained result back.</li> </ol> <p>There are some concerns that need to be addressed here:</p> <ul> <li> <p>This is a pretty <em>CPU intensive</em> operation. Configurations, in essence, are rules in the form of <code>(predicate, mutation)</code> pairs defined via business-friendly screens by shop specialists. When an attribute of a document gets updated, this change might be of interest to many configurations, which are determined by performing an inverse lookup on the tens of thousands of configuration predicates (e.g., <code>attrs.disk_capacity_bytes != null</code>) matching the last state of the document. Later on, the mutations (e.g., <code>doc.disk_capacity_gigabytes = attrs.disk_capacity_bytes / 1e9</code>) of the found configurations are executed to let them shape the document according to their needs.</p> <p>This innocent-looking procedure sneakily introduces two critical issues under the hood:</p> <ol> <li><em>How would you represent the configuration predicates such that you can match them against the content?</em></li> <li><em>How would you represent the configuration mutations such that you can execute them against the content?</em></li> </ol> <p>And it goes without saying, both of the aforementioned concerns need to be engineered efficiently.
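<p>Putting the steps above together, the per-message procedure can be sketched as follows – in Python for brevity (the actual pipeline is a Java service), with configurations as plain <code>(predicate, mutation)</code> callable pairs and the storage round-trip elided:</p>

```python
# A minimal in-memory sketch of the per-message ETL step.
# Predicates and mutations are plain callables here; the real system
# stores them as a DSL. All names are illustrative.

configurations = [
    # (predicate, mutation) pairs authored via business-friendly screens
    (lambda attrs: attrs.get("disk_capacity_bytes") is not None,
     lambda attrs, doc: doc.update(
         disk_capacity_gigabytes=attrs["disk_capacity_bytes"] / 1e9)),
]

def etl_step(doc, attrs_update):
    """Apply an attribute update and re-run the matching configurations."""
    doc.setdefault("attrs", {}).update(attrs_update)   # steps 1-2: fetch + update
    attrs = doc["attrs"]
    for predicate, mutation in configurations:         # step 3: inverse lookup
        if predicate(attrs):
            mutation(attrs, doc)                       # execute the mutation
    return doc                                         # step 4: persist (elided)

doc = etl_step({}, {"disk_capacity_bytes": 512e9})
```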
You are expected to repeat this procedure for each message JSON of the real-time content stream, where the traffic is in the order of millions of messages per day.</p> <p>As a concrete configuration example, consider the following: You have two “Disk Capacity” facets defined by business: one for the computers department, one for smart phones. The first one translates the <code>disk_capacity_bytes</code> attribute into a <code>disk_capacity_terabytes</code> attribute, which is defined to be exposed when <code>category == "computers"</code>, and the second translates it into a <code>disk_capacity_gigabytes</code> attribute, which is defined to be exposed when <code>category == "smart phones"</code>. Here both configurations are executed when the <code>attrs.disk_capacity_bytes != null</code> predicate holds.</p> </li> <li> <p>This operation needs to be performed <em>atomically</em>. Two concurrent operations touching the same product should not result in corrupt content.</p> </li> </ul> <p><a name="configuration-stream"></a></p> <h2 id="configuration-stream">Configuration Stream</h2> <p>Configurations are the rules defined via business-friendly screens. The modifications made there by shop specialists are published in snapshots once they think the changes have grown into a stable state and are ready to be exposed to the customer. Each published configuration snapshot ends up serving three purposes:</p> <ol> <li>the search gateway uses it to determine how to query the search index,</li> <li>the ETL pipeline uses it to process the real-time content stream,</li> <li>and the ETL pipeline <em>retroactively updates</em> the documents that are potentially affected.</li> </ol> <p>While the first two are relatively cheap operations, the last one is the elephant in the room! This is the first time in our beautiful tale described so far that we need to propagate a change to millions of documents.
Let me further explain this in an example:</p> <p>Let’s consider that the following category definition:</p> <pre><code class="language-javascript"><span class="k">if</span> <span class="p">(</span><span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">family_id</span> <span class="o">==</span> <span class="mi">1234</span> <span class="o">&amp;&amp;</span> <span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">chunk_id</span> <span class="o">==</span> <span class="mi">5678</span><span class="p">)</span> <span class="p">{</span> <span class="nx">doc</span><span class="p">.</span><span class="nx">category</span> <span class="o">=</span> <span class="s2">"books"</span> <span class="p">}</span></code></pre> <p>is modified as follows:</p> <pre><code class="language-javascript"><span class="k">if</span> <span class="p">(</span><span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">family_id</span> <span class="o">==</span> <span class="mi">1234</span> <span class="o">&amp;&amp;</span> <span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">chunk_id</span> <span class="o">==</span> <span class="mh">0xDEADBEEF</span><span class="p">)</span> <span class="p">{</span> <span class="nx">doc</span><span class="p">.</span><span class="nx">category</span> <span class="o">=</span> <span class="s2">"AWESOME BOOKS"</span> <span class="p">}</span></code></pre> <p>Sir, you are in trouble! 
As the very ETL pipeline, what you are expected to deliver is to</p> <ol> <li>find products matching the old predicate,</li> <li>revert the changes of the old configuration mutation by removing <code>books</code> from the <code>category</code> field,</li> <li>find products matching the new predicate,</li> <li>and apply the changes of the new configuration mutation by adding <code>AWESOME BOOKS</code> to the <code>category</code> field.</li> </ol> <p>This easier-said-than-done operation contains many implicit concerns:</p> <ul> <li> <p>The ETL needs to avoid removing <code>books</code> from the <code>category</code> field if there are rules, other than the changed one, adding <code>books</code> to the very same <code>category</code> field. There are two ways you can approach this:</p> <ol> <li> <p>With every value added to a field, store meta information pointing to the rules associated with that value. These back-tracking pointers optimize the check of whether a value can be removed or not, at the cost of maintaining them in an ocean of values.</p> </li> <li> <p>After removing every value, put the product back into the ETL pipeline, just like handling products in the real-time content stream. If there are any rules, other than the changed one, adding <code>books</code> to the very same <code>category</code> field, they will kick in. This simple approach comes at the cost of CPU-intensive and unfortunately mostly redundant processing.</p> </li> </ol> </li> <li> <p>Given that configuration predicates are allowed to access any field, how would one represent a predicate and translate it into an ETL storage query filter that performs well? (You would not want to scan the whole data set for each predicate that is changed, right? Well… depends.)</p> <p>Let’s first discuss the predicate representation issue, which was also a concern in the real-time content stream processing.
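<p>One direction – a sketch to make the discussion concrete, not necessarily our production representation – is to hold each predicate as data, e.g. a conjunction of <code>(field path, operator, value)</code> triples, which can then be both evaluated against an incoming message and, later, translated into a storage query:</p>

```python
# A predicate represented as data: a conjunction of (path, op, value) triples.
# Operators are whitelisted; field paths are free-form dotted paths.
import operator

OPS = {"==": operator.eq, "!=": operator.ne,
       ">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

def evaluate(conjuncts, attrs):
    """True iff every (path, op, value) triple holds for the given attrs."""
    for path, op, value in conjuncts:
        node = attrs
        for key in path.split("."):  # walk e.g. "gpc.family_id"
            node = node.get(key) if isinstance(node, dict) else None
        if node is None or not OPS[op](node, value):
            return False
    return True

predicate = [("gpc.family_id", "==", 1234), ("gpc.chunk_id", "==", 5678)]
```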
Here you might first fall into the trap of whitelisting the operators (<code>==</code>, <code>!=</code>, <code>&gt;</code>, <code>&gt;=</code>, <code>&lt;</code>, <code>&lt;=</code>, <code>~=</code>) and the content attributes (<code>attrs.gpc.family_id</code>, <code>attrs.gpc.chunk_id</code>, <code>attrs.disk_capacity_bytes</code>, etc.) that are allowed in configuration predicates. While whitelisting operators is fine, whitelisting the content attributes implies that the ETL pipeline, the configuration administration GUIs, etc. all need to have knowledge of this whitelist, which strictly depends on the structure of the real-time content stream messages. Whenever the message structures change or you want to add a new attribute to this whitelist – both happen a couple of times every year – you need to propagate this to many components in your service milky way and perform a deploy without downtime.</p> <p>What about translating these predicate representations into efficient ETL storage query filters? Let’s take the simplest approach: Represent each attribute with a separate field. Then let me ask you the following questions:</p> <ol> <li> <p>If you were to opt for an RDBMS, you could represent attributes by columns and create an index for each individual column. (Ouch!) Thanks to the half-century of battle-tested RDBMS literature, the database can easily optimize and perform an index scan for the constructed queries:</p> <pre><code class="language-sql"><span class="k">SELECT</span> <span class="p">...</span> <span class="k">FROM</span> <span class="n">content</span> <span class="k">WHERE</span> <span class="n">attrs_gpc_family_id</span> <span class="o">=</span> <span class="s1">'1234'</span> <span class="k">AND</span> <span class="n">attrs_gpc_chunk_id</span> <span class="o">=</span> <span class="s1">'5678'</span></code></pre> <p>That being said… What if you hit the maximum column count limitation? (Yes, we did!)
Further, what about attributes that are lists of objects:</p> <pre><code class="language-json"><span class="p">{</span> <span class="nt">"authors"</span><span class="p">:</span> <span class="p">[</span> <span class="p">{</span> <span class="nt">"fname"</span><span class="p">:</span> <span class="s2">"Volkan"</span><span class="p">,</span> <span class="nt">"lname"</span><span class="p">:</span> <span class="s2">"Yazici"</span> <span class="p">},</span> <span class="p">{</span> <span class="nt">"fname"</span><span class="p">:</span> <span class="s2">"Lourens"</span><span class="p">,</span> <span class="nt">"lname"</span><span class="p">:</span> <span class="s2">"Heijs"</span> <span class="p">}</span> <span class="p">]</span> <span class="p">}</span></code></pre> <p>You definitely cannot store these in a single column and still query each individual component. Ok, then you can normalize the data as follows:</p> <pre><code class="language-sql"><span class="k">SELECT</span> <span class="p">...</span> <span class="k">FROM</span> <span class="n">content</span><span class="p">,</span> <span class="n">attribute</span> <span class="k">AS</span> <span class="n">a1</span><span class="p">,</span> <span class="n">attribute</span> <span class="k">AS</span> <span class="n">a2</span> <span class="k">WHERE</span> <span class="n">a1</span><span class="p">.</span><span class="n">content_id</span> <span class="o">=</span> <span class="n">content</span><span class="p">.</span><span class="n">id</span> <span class="k">AND</span> <span class="n">a1</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s1">'gpc_family_id'</span> <span class="k">AND</span> <span class="n">a1</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="s1">'1234'</span> <span class="k">AND</span> <span class="n">a2</span><span class="p">.</span><span class="n">content_id</span> <span class="o">=</span> <span
class="n">content</span><span class="p">.</span><span class="n">id</span> <span class="k">AND</span> <span class="n">a2</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s1">'gpc_chunk_id'</span> <span class="k">AND</span> <span class="n">a2</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="s1">'5678'</span></code></pre> <p>So far so good. But… In a matter of months, you will need to start partitioning tables and maybe even move certain partitions into separate database instances to maintain the latency under a certain threshold. (Yes, we did this as well!) But this never-ending database structure optimization feels more and more like you are inventing your own distributed database on top of a plain RDBMS. Does this really still need to be this way in 2018?</p> </li> <li> <p>If you were to opt for <a href="https://www.mongodb.com/">MongoDB</a>, then, as with an RDBMS, you still need to create an explicit index on each (whitelisted) field. For filters involving multiple fields (e.g., <code>attrs.gpc.family_id == 1234 &amp;&amp; attrs.gpc.chunk_id == 5678</code>), the MongoDB query optimizer can employ individual field indices via <a href="https://docs.mongodb.com/manual/core/index-intersection/">index intersection</a>. That being said, our experience with this feature has not been very pleasant.</p> <p>The issue where attributes might contain lists of objects is <a href="https://docs.mongodb.com/manual/tutorial/query-array-of-documents/">not a problem for MongoDB</a>.</p> </li> <li> <p>If you were to opt for <a href="https://cloud.google.com/datastore">Google Cloud Datastore</a>, you would need to create explicit indices for each potential filter combination – and order matters! Yes, you read that right! Let me exemplify this bizarre situation.
If you have configurations with the following predicates:</p> <ul> <li><code>attrs.gpc.family_id == 1234</code></li> <li><code>attrs.gpc.chunk_id == 5678</code></li> <li><code>attrs.gpc.family_id == 1234 &amp;&amp; attrs.gpc.chunk_id == 5678</code></li> <li><code>attrs.gpc.chunk_id == 5678 &amp;&amp; attrs.gpc.family_id == 1234</code></li> </ul> <p>you need to define 4 different indices! Ouch! This on its own was a Datastore showstopper for us.</p> </li> <li> <p>If you were to opt for <a href="https://www.elastic.co/products/elasticsearch">Elasticsearch</a>, all fields are indexed by default and you can use them in any combination! Yay! No need for whitelisting! And similar to MongoDB, Elasticsearch also allows <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html">querying lists of objects</a>; you just need to declare them explicitly as <code>nested</code>. If you don’t even want to worry about that, you can add a dynamic mapping template to make each object nested by default.
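<p>As a sketch of why this is convenient: assuming predicates are held as <code>(field path, operator, value)</code> triples, translating an equality-only predicate such as <code>attrs.gpc.family_id == 1234 &amp;&amp; attrs.gpc.chunk_id == 5678</code> into an Elasticsearch <code>bool</code> filter becomes mechanical (fields inside <code>nested</code> objects would additionally require <code>nested</code> queries):</p>

```python
def to_es_filter(conjuncts):
    """Translate (path, op, value) triples into an Elasticsearch bool filter.
    Only equality is sketched here; range operators would map to "range"
    clauses. The query body is built as a plain dict."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {f"attrs.{path}": value}}
                    for path, op, value in conjuncts
                    if op == "=="
                ]
            }
        }
    }

query = to_es_filter([("gpc.family_id", "==", 1234),
                      ("gpc.chunk_id", "==", 5678)])
```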
Following is the index mapping you can use for that purpose:</p> <pre><code class="language-json"><span class="p">{</span> <span class="nt">"date_detection"</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="nt">"dynamic_templates"</span><span class="p">:</span> <span class="p">[</span> <span class="p">{</span> <span class="nt">"strings"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"match_mapping_type"</span><span class="p">:</span> <span class="s2">"string"</span><span class="p">,</span> <span class="nt">"mapping"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"keyword"</span> <span class="p">}</span> <span class="p">}</span> <span class="p">},</span> <span class="p">{</span> <span class="nt">"objects"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"match_mapping_type"</span><span class="p">:</span> <span class="s2">"object"</span><span class="p">,</span> <span class="nt">"mapping"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"nested"</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> <span class="p">]</span> <span class="p">}</span></code></pre> <p>Above mapping also disables analyzing the fields of type <code>string</code>, since we are not interested in performing fuzzy queries. Clearly, date detection is disabled for similar reasons.</p> <p>These being said, Elasticsearch is known to suffer from deteriorating query performance over time when exposed to high update rates.</p> </li> </ol> </li> </ul> <p><a name="operational-overview"></a></p> <h1 id="operational-overview">Operational Overview</h1> <p>So far we examined the current ETL setup with concrete examples for several cases. 
We broke down the system into its individual input sources and detailed their implications on certain architectural decisions. Let’s wrap up these mind-boggling details into operational abstractions:</p> <p><img src="etl-abstraction.jpg" alt="The ETL: Operational Overview"></p> <p>Given these operational abstractions, let me summarize the constraints the configuration components (predicate and mutation) imply.</p> <p><a name="configuration-mutations"></a></p> <h2 id="configuration-mutations">Configuration Mutations</h2> <p>If you recall, configuration mutations were simple document enhancement instructions that I exemplified as follows:</p> <pre><code class="language-javascript"><span class="nx">doc</span><span class="p">.</span><span class="nx">category</span> <span class="o">=</span> <span class="s2">"books"</span></code></pre> <p>Here <code>doc</code> is a dictionary denoting the ETL’ed document source, and the mutation “adds” the <code>books</code> value to its <code>category</code> field. This innocent-looking (and, for simplification purposes, JavaScript-employed) expression can (and does!)
go to unintended extents:</p> <pre><code class="language-javascript"><span class="k">if</span> <span class="p">(</span><span class="nx">attrs</span><span class="p">.</span><span class="nx">suitable_for_month</span> <span class="o">&lt;=</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span> <span class="nx">doc</span><span class="p">.</span><span class="nx">childhood_stage</span> <span class="o">=</span> <span class="s2">"newborn"</span><span class="p">;</span> <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="nx">attrs</span><span class="p">.</span><span class="nx">suitable_for_month</span> <span class="o">&lt;=</span> <span class="mi">12</span><span class="p">)</span> <span class="p">{</span> <span class="nx">doc</span><span class="p">.</span><span class="nx">childhood_stage</span> <span class="o">=</span> <span class="s2">"infant"</span><span class="p">;</span> <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="nx">attrs</span><span class="p">.</span><span class="nx">suitable_for_month</span> <span class="o">&lt;=</span> <span class="mi">48</span><span class="p">)</span> <span class="p">{</span> <span class="nx">doc</span><span class="p">.</span><span class="nx">childhood_stage</span> <span class="o">=</span> <span class="s2">"toddler"</span><span class="p">;</span> <span class="p">}</span></code></pre> <p>The choice of the mutation <a href="https://en.wikipedia.org/wiki/Domain-specific_language">DSL</a> employed is expected to deliver the following requirements:</p> <ul> <li>It “must” support JSON input and output for the real-time content stream. (See step B4 in the figure.)</li> <li>It “should” support ETL storage input and output for the configuration snapshot stream. 
(See step A4 in the figure.)</li> </ul> <p>The reason that the latter functionality is marked as optional is that the ETL pipeline can also retrieve these documents in raw form from the storage, convert them to JSON, execute mutations, and persist them back again – assuming data integrity is provided by other means, e.g., transactions, retries powered by compare-and-swap operations, etc.</p> <p><a name="configuration-predicates"></a></p> <h2 id="configuration-predicates">Configuration Predicates</h2> <p>Configuration predicates were simple conditions restricted to a whitelisted set of operators (<code>==</code>, <code>!=</code>, <code>&gt;</code>, <code>&gt;=</code>, <code>&lt;</code>, <code>&lt;=</code>, <code>~=</code>) supporting grouping:</p> <pre><code class="language-javascript"><span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">family_id</span> <span class="o">==</span> <span class="mi">1234</span> <span class="o">&amp;&amp;</span> <span class="nx">attrs</span><span class="p">.</span><span class="nx">gpc</span><span class="p">.</span><span class="nx">chunk_id</span> <span class="o">==</span> <span class="mi">5678</span></code></pre> <p>Similar to mutations, the predicate DSL chosen is expected to satisfy the following requirements:</p> <ul> <li>It “must” support JSON input for the real-time content stream. (See step B2 in the figure.)</li> <li>It “should” support ETL storage input for determining the documents affected by the configuration snapshot delta. (See step A4 in the figure.)</li> </ul> <p>We relaxed the latter constraint since one can very well prefer to put the entire stored document collection (Ouch!) back into the ETL pipeline, process it, detect the changed documents, and persist the updates. This approach rests on certain assumptions though:</p> <ul> <li>We don’t need to perform this too often. That is, the frequency of configuration snapshots is relatively low, e.g., max.
a couple of times a day.</li> <li>The snapshot deltas affect a significant percentage of the entire collection, to an extent that the advantage of finding and processing only the affected documents diminishes.</li> </ul> <p>Given that you still need to make a back-of-the-envelope calculation on your cloud bill for each approach, our years of statistics on ETL configuration snapshots indicate that most of the time snapshot deltas affect at most 5% of the entire collection, and the average is less than 1% – thanks to the incremental updates carried out by shop specialists. Hence, performing a complete ETL a couple of times a day feels like overkill and hurts the engineer within you.</p> <p><a name="old-etl"></a></p> <h1 id="the-old-etl">The Old ETL</h1> <p>The old ETL was a single Oracle database where the configurations were modeled in PL/SQL. Since the configuration abstraction language was the very same language the database itself uses, executing mutations and predicates was effortless. Hail <a href="https://en.wikipedia.org/wiki/SQL_injection">SQL injection</a> as a feature! Though this came with some notable costs:</p> <ul> <li>Using PL/SQL within the abstraction model created both functional and financial vendor lock-in. The functional deficiency (limited expressiveness, leakage of PL/SQL into irrelevant components) obstructed many innovations over the years, and innovating became more and more difficult as time passed. Additionally, it constituted a significant obstacle for migrating the service to the cloud. The financial aspect was negligible at the scale of <a href="https://bol.com">bol.com</a>.</li> <li>Rolling back the changes of an updated configuration mutation is quite a PL/SQL engineering endeavor to implement in practice. This difficulty, spiced up with the insufficient logging, testing, debugging, profiling, etc. utilities, drew programmers back from taking this path.
<em>Hence, there was a 12+ hour long complete ETL run every night for configuration snapshot deltas.</em> This beast, tamed by a couple of experienced engineers, had a reputation for frequent hiccups and for bugs that were really difficult to find, debug, and reproduce, let alone fix!</li> </ul> <p>In its previous incarnation, the content attributes were stored in <code>&lt;id, content_id, key, value&gt;</code> normalized form. This approach started to suffer from efficiency aches in the hinges pulling the ETL’ed data to the search index. The Oracle consultants hired back then examined the usage and recommended going with a denormalized structure where each attribute is stored as a column. In addition to temporarily bandaging up the efficiency-related wounds, this allowed DBAs to let their imaginations go wild mapping the attributes to columns. Recall the attributes composed of objects I mentioned above? Special characters were used to create such multi-value attributes, which was pretty much (to put it mildly) unpleasant. But the killer bullet came in the form of a six-inch punch referred to as <a href="https://stackoverflow.com/a/14722914/1278899">the maximum allowed column count limit</a>. But isn’t engineering all about <a href="https://youtu.be/D_Vg4uyYwEk">how hard you can get hit and keep moving forward</a>? Yes, comrade! We thought so and used a single binary XML column to store attributes, queried them using the Oracle XPath toolbox, escaped attribute values, finally concatenated them into SQL strings that were eventually executed, and for sure crossed our fingers.</p> <p>There are a couple of important details that I could not manage to cover in the above war diary without spoiling the coherency. Let me drop them here in no particular order:</p> <ul> <li>Task parallelization is pretty difficult in PL/SQL.
We tried patching this hole via internal Oracle AQs, but I am not really sure whether it improved or worsened the state.</li> <li>In a database procedure that is expected to run for 12+ hours, Murphy’s law works flawlessly. Anything that can go wrong, did, does, and will go wrong. We wisely(!) engineered the system to persist its state at certain check points, constituting retriable handles to invoke when you come in the morning and see that the ETL crashed.</li> <li>The number of moving components necessitated the use of <a href="https://www.cronacle.com/">a proprietary scheduling tool supporting Oracle</a>. The schedule was glued together with <a href="https://www.gnu.org/software/bash/">bash</a> scripts, designed in a proprietary development environment only available for Windows, and rolled out on Oracle machines running GNU/Linux. Neither GNU/Linux- nor Windows-using developers were fond of this situation.</li> <li>Due to the high cost of a failing ETL, business also did not feel empowered to change and/or commercially optimize it easily. This was a pretty demotivating issue affecting both the technical and business people who needed to work with it.</li> </ul> <p>Enough blaming the former engineers. We need to get our facts right. The aforementioned PL/SQL giant was not rolled out in a day with a big bang. This more than a decade old ETL pipeline was developed with all the best practices and tooling available back then. The more you dive into its source code and navigate through commits of features spanning years, the easier it becomes to see what went wrong and where. You start to realize how the patterns that necessitated exceptional handling of certain features, many of them due to backward compatibility with legacy systems that have since been deprecated or replaced by newcomers, exploded the complexity to unintended depths.
Software development is a never-ending process, and the axioms you base your initial architecture on become invalidated over time due to changing business needs. Aiming for infinite flexibility comes with an engineering cost as well, one that might very well fall short of being justified. One should also include the massive burst of data volume and its update frequency in this list. I personally think the old ETL pipeline and its engineers did a fantastic job. The tool served its purpose for more than a decade and harvested an immense amount of lessons for its successor. I would be more than happy if we as a team manage to deliver such a long-living product as well.</p> <p><a name="battle-of-storage-engines"></a></p> <h1 id="the-battle-of-storage-engines">The Battle of Storage Engines</h1> <p>Given our functional requirements, we evaluated a couple of different ETL pipeline storage solutions, which I <a href="#configuration-stream">hinted at earlier</a>. Following is the feature matrix of each candidate:</p> <table> <thead> <tr> <th>Storage Solution</th> <th>Distributed</th> <th>Sharded</th> <th>Required Indices</th> <th>Integrity Measure</th> </tr> </thead> <tbody> <tr> <td>PostgreSQL</td> <td>No</td> <td>No</td> <td>One<sup>1</sup> </td> <td>Transactions</td> </tr> <tr> <td>PostgreSQL (partitioned)</td> <td>No</td> <td>Yes<sup>2</sup> </td> <td>One<sup>1</sup> </td> <td>Transactions</td> </tr> <tr> <td>MongoDB</td> <td>Yes</td> <td>Yes<sup>3</sup> </td> <td>Some<sup>4</sup> </td> <td>Compare-and-swap<sup>5</sup> </td> </tr> <tr> <td>Elasticsearch</td> <td>Yes</td> <td>Yes</td> <td>None</td> <td>Compare-and-swap<sup>6</sup> </td> </tr> </tbody> </table> <p><sup>1</sup> A PostgreSQL <code>jsonb</code> index covers all fields.<br> <sup>2</sup> PostgreSQL partitioning is not sharding in the distributed sense, but still serves a similar purpose.<br> <sup>3</sup> MongoDB sharding requires <a href="https://docs.mongodb.com/manual/sharding/#shard-keys">manual
configuration</a>.<br> <sup>4</sup> MongoDB requires an explicit index for each whitelisted field allowed in ETL configuration predicates.<br> <sup>5</sup> MongoDB <a href="https://docs.mongodb.com/manual/core/write-operations-atomicity/"><code>updateMany()</code> or <code>findAndModify()</code></a> can be leveraged for the desired integrity.<br> <sup>6</sup> The Elasticsearch <code>_version</code> field can be leveraged to implement a compare-and-swap loop.</p> <p><a name="benchmark-setup"></a></p> <h2 id="benchmark-setup">Benchmark Setup</h2> <p>For the benchmark, we populated each store with 33 million JSON documents, each weighing 2.5KB on average. One of the contrived fields in the document is <code>search_rank</code>. Later on, a file consisting of 6 million distinct <code>&lt;id, search_rank&gt;</code> pairs is streamed in batches of size 1000. For each batch, we first fetch the old <code>search_rank</code>s associated with the <code>id</code>s and then bulk update these with the new <code>search_rank</code>s. What we tried to emulate in this scenario is a bulk update triggered by a configuration snapshot delta, which is the most storage-demanding operation in the ETL pipeline.</p> <p>The test bed used is a cluster composed of 6 dedicated machines with the following specifications:</p> <ul> <li> <strong>CPU</strong>: 16 core Intel Xeon E5-2620 v4 @ 2.10GHz</li> <li> <strong>Memory/Swap</strong>: 128GB/16GB</li> <li> <strong>Disk</strong>: 375GB (Intel P4800X Performance NVMe PCIe SSD)</li> <li> <strong>Kernel</strong>: 3.10.0-693.1.1.el7.x86_64</li> </ul> <p>We further configured each store as follows:</p> <ul> <li> <p><strong>PostgreSQL</strong>: Just one PostgreSQL 9.6.10 instance containing a single <code>&lt;id, content&gt;</code> table where <code>content</code> is of type <a href="https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING"><code>jsonb</code></a>.
The benchmark is configured to update only the <code>search_rank</code> attribute of the <code>content</code> column.</p> </li> <li> <p><strong>PostgreSQL (partitioned)</strong>: Same as above, but the <code>content</code> table is partitioned into 10 tables.</p> </li> <li> <p><strong>MongoDB</strong>: MongoDB 3.6 with the following configurations:</p> <pre><code class="language-yaml"><span class="l-Scalar-Plain">systemLog.destination</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">file</span> <span class="l-Scalar-Plain">systemLog.logAppend</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">true</span> <span class="l-Scalar-Plain">processManagement.fork</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">true</span> <span class="l-Scalar-Plain">storage.engine</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">wiredTiger</span> <span class="l-Scalar-Plain">security.authorization</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">enabled</span> <span class="l-Scalar-Plain">replication.oplogSizeMB</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">9216</span></code></pre> <p>Note that sharding is not enabled.
(More on this later.)</p> <p>Similar to the PostgreSQL setup, the benchmark is configured to update only the <code>search_rank</code> attribute of documents.</p> </li> <li> <p><strong>Elasticsearch</strong>: Elasticsearch 6.3.0 with the following JVM flags:</p> <pre><code>-Xms30g -Xmx30g -Xss256k -XX:NewRatio=3 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime </code></pre> <p>Here the JVM heap size is set to 30G due to the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html">compressed OOPs limitation</a>.</p> <p>Different from the PostgreSQL and MongoDB setups, where only the <code>search_rank</code> attribute is updated, the Elasticsearch benchmark is configured to update the entire document. While this overkill hammers Elasticsearch way heavier than the other stores (since Elasticsearch will create quite some garbage segments waiting to be merged, and making every object nested worsens the case even more), it is more strategically aligned with how we want to use it in the future.</p> <p><a name="benchmark-results"></a></p> <h2 id="benchmark-results">Benchmark Results</h2> <p>Below you will see the results of the benchmark for only MongoDB and Elasticsearch. The reason the PostgreSQL results were omitted is that, no matter what kind of optimization we threw at it, the benchmark always took more than 2 hours, regardless of partitioning, whereas MongoDB and Elasticsearch took a couple of minutes.</p> <style> .concurrency { text-align: center; } .measurement { text-align: right; } .per-batch .measurement { font-weight: bold; } </style> <table> <thead> <tr> <th>Store</th> <th>Conc.<sup>7</sup> </th> <th>Latency</th> <th>Total (s)</th> <th>Fetch<sup>8</sup> 75% (ms)</th> <th>Fetch<sup>8</sup> 99% (ms)</th> <th>Fetch<sup>8</sup> Max.
(ms)</th> <th>Update<sup>9</sup> 75% (ms)</th> <th>Update<sup>9</sup> 99% (ms)</th> <th>Update<sup>9</sup> Max. (ms)</th> </tr> </thead> <tbody> <tr> <td rowspan="6">MongoDB</td> <td rowspan="2" class="concurrency">8</td> <td>total</td> <td class="measurement">518</td> <td class="measurement">68</td> <td class="measurement">999</td> <td class="measurement">3380</td> <td class="measurement">64</td> <td class="measurement">2347</td> <td class="measurement">4153</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">8</td> <td class="measurement">125</td> <td class="measurement">423</td> <td class="measurement">8</td> <td class="measurement">293</td> <td class="measurement">519</td> </tr> <tr> <td rowspan="2" class="concurrency">16</td> <td>total</td> <td class="measurement">526</td> <td class="measurement">71</td> <td class="measurement">3082</td> <td class="measurement">7905</td> <td class="measurement">68</td> <td class="measurement">5564</td> <td class="measurement">7955</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">4</td> <td class="measurement">193</td> <td class="measurement">494</td> <td class="measurement">4</td> <td class="measurement">348</td> <td class="measurement">497</td> </tr> <tr> <td rowspan="2" class="concurrency">32</td> <td>total</td> <td class="measurement">518</td> <td class="measurement">61</td> <td class="measurement">6668</td> <td class="measurement">11465</td> <td class="measurement">98</td> <td class="measurement">10533</td> <td class="measurement">13784</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">2</td> <td class="measurement">208</td> <td class="measurement">358</td> <td class="measurement">3</td> <td class="measurement">329</td> <td class="measurement">431</td> </tr> <tr> <td rowspan="6">Elasticsearch</td> <td rowspan="2" class="concurrency">8</td> <td>total</td> <td class="measurement">251</td> <td 
class="measurement">278</td> <td class="measurement">423</td> <td class="measurement">798</td> <td class="measurement">94</td> <td class="measurement">186</td> <td class="measurement">412</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">35</td> <td class="measurement">53</td> <td class="measurement">100</td> <td class="measurement">12</td> <td class="measurement">23</td> <td class="measurement">52</td> </tr> <tr> <td rowspan="2" class="concurrency">16</td> <td>total</td> <td class="measurement">196</td> <td class="measurement">478</td> <td class="measurement">697</td> <td class="measurement">1004</td> <td class="measurement">141</td> <td class="measurement">266</td> <td class="measurement">410</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">30</td> <td class="measurement">44</td> <td class="measurement">63</td> <td class="measurement">9</td> <td class="measurement">17</td> <td class="measurement">26</td> </tr> <tr> <td rowspan="2" class="concurrency">32</td> <td>total</td> <td class="measurement">175</td> <td class="measurement">951</td> <td class="measurement">1368</td> <td class="measurement">1515</td> <td class="measurement">214</td> <td class="measurement">331</td> <td class="measurement">828</td> </tr> <tr class="per-batch"> <td colspan="2">per batch</td> <td class="measurement">30</td> <td class="measurement">43</td> <td class="measurement">47</td> <td class="measurement">7</td> <td class="measurement">10</td> <td class="measurement">26</td> </tr> </tbody> </table> <p><sup>7</sup> Number of concurrent batches.<br> <sup>8</sup> Time it takes to fetch a batch.<br> <sup>9</sup> Time it takes to update a batch.</p> <p>Let me share some observations from the results:</p> <ul> <li> <p><strong>Increasing concurrency</strong> improves Elasticsearch performance (up to 32 concurrent batches) but does not have much effect on MongoDB.</p> </li> <li> <p><strong>Elasticsearch rocked in 
performance</strong> even though it is hammered with the update of the entire document whereas MongoDB is just trying to update a single attribute. Using 32 concurrent batches, it took 175s and 518s for Elasticsearch and MongoDB, respectively, to complete the benchmark.</p> </li> <li> <p><strong>Elasticsearch yielded way more predictable performance</strong> figures compared to MongoDB. Note the difference between the 75- and 99-percentile figures.</p> </li> <li> <p><strong>Elasticsearch segment merges</strong> were unexpectedly stable during the runs, whereas we were anticipating them to become the bottleneck due to the high update rate. But compare-and-swap loops played over <code>_version</code> fields allowed for the necessary data integrity without breaking a sweat.</p> </li> </ul> <p>At the time of testing, we initially were not able to enable sharding in MongoDB due to operational obstacles on our side. The Elasticsearch results were promising to the point of even shocking the hired Elasticsearch consultants, so we decided to go with it, a tool with which we have years of production experience. If we put the problem of whitelisted configuration predicate fields aside – that is, the explicit indices required on what can be queried – MongoDB could very well be a viable option as well.</p> <p>But, really, why does Elasticsearch have a reputation of not being recommended as a primary data store? I think it all started when the official project website years ago contained an explicit statement admitting that Elasticsearch is not intended to be used as a primary data store. Once, as the very owner of the project itself, you admit this fact, it is really difficult to convince people of the opposite – even if the situation might have improved since. Later on, the published <a href="https://jepsen.io/">Jepsen</a> (an effort to improve the safety of distributed databases, queues, consensus systems, etc.)
reports (<a href="https://aphyr.com/posts/317-call-me-maybe-elasticsearch">one in 2014-06-15 using Elasticsearch 1.1.0</a> and the other <a href="https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0">one in 2015-04-27 using Elasticsearch 1.5.0</a>) worsened the situation, and this bad reputation disseminated over the web at the speed of light. While this tornado was DDoS’ing the entire Hacker News, Proggit, etc. blogosphere with endless discussions in the form of <i>“See? I told ya so!”</i>, the Elasticsearch team put up an <a href="https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html">Elasticsearch Resiliency Status</a> page. There they started sharing (even up to today!) known resiliency problems, including the ones found in the Jepsen reports, converting them into reproducible cases in <a href="https://github.com/elastic/elasticsearch/issues/">GitHub issues</a>, and tackling them one at a time. What else would qualify as a professional commitment if not this? Again, these were all back in early 2015. Our Elasticsearch production deployments successfully managed to return with a victory from every battle front thrown at them. It did not always feel like a walk in the park. We had our hard times, though we managed to overcome those and noted the experience down in the book of lessons learnt. Let me share some common practices from that collection:</p> <ul> <li> <strong>Security</strong>: Elasticsearch does not provide any security measures (encryption, etc.) out of the box. We do not use Elasticsearch to store any sort of <a href="https://en.wikipedia.org/wiki/Personally_identifiable_information">PII</a>.</li> <li> <strong>Transactions</strong>: Elasticsearch does not have transaction support. We work around this by performing compare-and-swap loops over the <code>_version</code> field.</li> <li> <strong>Tooling</strong>: Elasticsearch tooling is… just a piece of crap.
It doesn’t have a proper development environment – you are stuck with running a fully blown Kibana just to be able to use its arcane <a href="https://www.elastic.co/guide/en/kibana/current/console-kibana.html">Console</a>. Its Java client drags in the entire Milky Way of Elasticsearch artifacts as a dependency, which is a <a href="https://en.wikipedia.org/wiki/Java_Classloader#JAR_hell">JAR Hell</a> time bomb waiting to explode. Further, the recently introduced <a href="https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high.html">high-level REST client</a> leaks the Apache HTTP Client API models, etc. For the leaked models and transitive dependencies, there is not much you can do – you just learn to live with them. In place of an IDE, you just keep a thick stack of HTTP request recipes using your favorite HTTP client, e.g., <a href="https://curl.haxx.se/2">cURL</a>, <a href="https://www.getpostman.com/">Postman</a>, <a href="https://httpie.org/">httpie</a>, etc.</li> <li> <strong>Documentation</strong>: Elasticsearch does not have documentation; <a href="https://www.postgresql.org/docs/">PostgreSQL has documentation</a>, <a href="https://docs.mongodb.com/">MongoDB has documentation</a>. What Elasticsearch has is <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html">a stack of surface-scratching blog posts served in the form of a documentation-like website</a>. Elasticsearch also has an ocean of <a href="https://stackoverflow.com/questions/tagged/elasticsearch">Stack Overflow</a> and <a href="https://discuss.elastic.co/c/elasticsearch">forum</a> posts where you are allowed to swim at your convenience. That being said, one needs to admit that the situation is improving over time. (Yes, it was way worse!)</li> <li> <strong>Resiliency</strong>: Yes, Elasticsearch can crash, just like any other piece of software.
In order to address these emergencies, in addition to hot-standby clusters, we take regular <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html">snapshots</a> and persist the messages processed by the ETL pipeline to a separate storage providing efficient write and bulk read operations, e.g., PostgreSQL, Google BigQuery, etc. In case of need, we just restore from a snapshot and replay the necessary set of messages to recover the lost state.</li> </ul> <p>Is Elasticsearch the perfect tool for the job at hand? Not really. But it is the one closest to that. We also know how to deal with each other – just like in any other relationship.</p> <p><a name="new-etl"></a></p> <h1 id="the-new-etl">The New ETL</h1> <p>By taking into account the ETL pipeline concerns detailed in previous chapters, we derived a list of basic foundations that we aim to deliver:</p> <ol> <li>The configuration DSL must be abstract enough to avoid <del>any</del> too much vendor lock-in. One must be able to represent configurations in this DSL such that applying them to a JSON document and/or the underlying storage unit is a matter of writing the necessary adapter classes.</li> <li>The storage must allow the ETL pipeline to query the entire collection using any possible filter combination allowed by the configuration predicate DSL. This is a crucial pillar in the design to enable real-time processing of every message, from both the content and the configuration snapshot streams, without necessitating an ETL run over the complete collection, which used to be the case in the old ETL pipeline.</li> </ol> <p>Let me elaborate on how we addressed these deliverables.</p> <p><a name="primary-storage-elasticsearch"></a></p> <h2 id="the-primary-storage-elasticsearch">The Primary Storage: Elasticsearch</h2> <p>The previous benchmark section already detailed the rationale behind employing Elasticsearch as the primary storage. It is distributed and sharded by default.
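<p>The compare-and-swap loops over the <code>_version</code> field mentioned earlier can be made concrete with a small simulation. The sketch below is plain JavaScript (mirroring the earlier snippets) and uses an in-memory stand-in for a versioned store rather than a real Elasticsearch client; names like <code>VersionedStore</code> and <code>casUpdate</code> are illustrative, not actual APIs:</p>

```javascript
// In-memory stand-in for a versioned document store: a write succeeds
// only if the caller presents the version it previously read.
class VersionedStore {
  constructor() { this.docs = new Map(); }
  get(id) { return this.docs.get(id); } // yields { source, version }
  put(id, source) { this.docs.set(id, { source, version: 1 }); }
  update(id, source, expectedVersion) {
    const entry = this.docs.get(id);
    if (!entry || entry.version !== expectedVersion) return false; // conflict
    this.docs.set(id, { source, version: entry.version + 1 });
    return true;
  }
}

// Compare-and-swap loop: read, mutate, and conditionally write;
// on a version conflict, re-read and retry.
function casUpdate(store, id, mutate) {
  for (;;) {
    const { source, version } = store.get(id);
    const mutated = mutate({ ...source });
    if (store.update(id, mutated, version)) return mutated;
  }
}

const store = new VersionedStore();
store.put("doc-1", { search_rank: 10 });

let attempts = 0;
const result = casUpdate(store, "doc-1", (doc) => {
  attempts += 1;
  // Simulate a concurrent writer sneaking in between our read and write.
  if (attempts === 1) store.update("doc-1", { search_rank: 20 }, 1);
  doc.search_rank += 1;
  return doc;
});
console.log(attempts, result.search_rank); // 2 21
```

<p>The first write attempt fails because the concurrent writer bumped the version, so the loop re-reads and succeeds on the second pass – the same optimistic-concurrency pattern played over Elasticsearch’s <code>_version</code> field.</p>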
It doesn’t require explicit indices on a whitelist of allowed configuration predicate fields – every field can be queried by default. It has no problems with querying fields containing a list of objects. It provides sufficient leverage for data integrity via compare-and-swap loops over <code>_version</code> fields. It is very efficient on bulk fetches and updates, which was totally unexpected for us. Last, but not least, it is our bread and butter in search and we have plenty of experience with it.</p> <p><a name="configuration-dsl-json-groovy"></a></p> <h2 id="the-configuration-dsl-json-and-groovy">The Configuration DSL: JSON and Groovy</h2> <p>In the case of the configuration DSL, we wanted to stop the plague of PL/SQL leakage all around the code base. For this purpose, we decided to go with the model depicted below.</p> <p><img src="dsl.jpg" alt="The New Configuration DSL"></p> <p>Here we replaced the SQL WHERE clauses, which were used to represent configuration predicates in the old ETL pipeline, with JSON describing the structure of the predicate. This new predicate representation, resembling Elasticsearch filters, is translated into individual executors matching against either JSON (coming from the real-time content stream) or the storage engine, that is, Elasticsearch. Note that the predicate representation is independent of the medium (JSON, Elasticsearch, etc.) it is executed against; indeed, we even implemented a MongoDB adapter at some point.
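<p>To give a feel for such a medium-independent executor, here is a minimal, hypothetical interpreter (plain JavaScript, not the actual production code) that matches a predicate tree of <code>nested</code>, <code>and</code>, and <code>equals</code> nodes against a plain JSON document; a second adapter could translate the very same tree into an Elasticsearch query instead:</p>

```javascript
// Evaluates a predicate tree of `nested`, `and`, and `equals` nodes
// against a plain JSON document, e.g. one from the real-time content stream.
function matches(predicate, doc) {
  // Resolves a path like ["gpc", "family_id"] relative to a document node.
  const resolve = (path, node) =>
    path.reduce((n, key) => (n == null ? undefined : n[key]), node);
  switch (predicate.type) {
    case "and":
      return predicate.filters.every((f) => matches(f, doc));
    case "nested": {
      // Descend along `path`; a list-valued field matches if any element does.
      const scope = resolve(predicate.path, doc);
      if (Array.isArray(scope)) return scope.some((s) => matches(predicate.filter, s));
      return scope != null && matches(predicate.filter, scope);
    }
    case "equals":
      return resolve(predicate.path, doc) === predicate.value;
    default:
      throw new Error(`unknown predicate type: ${predicate.type}`);
  }
}

const predicate = {
  type: "nested",
  path: ["content", "attribute"],
  filter: {
    type: "and",
    filters: [
      { type: "equals", path: ["gpc", "family_id"], value: "1234" },
      { type: "equals", path: ["gpc", "chunk_id"], value: "5678" },
    ],
  },
};
const doc = {
  content: { attribute: { gpc: { family_id: "1234", chunk_id: "5678" } } },
};
console.log(matches(predicate, doc)); // true
```

<p>The remaining whitelisted operators (<code>!=</code>, <code>&gt;=</code>, <code>~=</code>, etc.) would simply become additional node types in the same tree.</p>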
An example configuration predicate JSON is shown below:</p> <pre><code class="language-json"><span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"nested"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"content"</span><span class="p">,</span> <span class="s2">"attribute"</span><span class="p">],</span> <span class="nt">"filter"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"and"</span><span class="p">,</span> <span class="nt">"filters"</span><span class="p">:</span> <span class="p">[</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"nested"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"gpc"</span><span class="p">],</span> <span class="nt">"filter"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"and"</span><span class="p">,</span> <span class="nt">"filters"</span><span class="p">:</span> <span class="p">[</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"equals"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"family_id"</span><span class="p">],</span> <span class="nt">"value"</span><span class="p">:</span> <span class="s2">"1234"</span> <span class="p">},</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"equals"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"chunk_id"</span><span class="p">],</span> <span class="nt">"value"</span><span class="p">:</span> <span class="s2">"5678"</span>
<span class="p">}</span> <span class="p">]</span> <span class="p">}</span> <span class="p">},</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"nested"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"authors"</span><span class="p">],</span> <span class="nt">"filter"</span><span class="p">:</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"and"</span><span class="p">,</span> <span class="nt">"filters"</span><span class="p">:</span> <span class="p">[</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"equals"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"fname"</span><span class="p">],</span> <span class="nt">"value"</span><span class="p">:</span> <span class="s2">"Volkan"</span> <span class="p">},</span> <span class="p">{</span> <span class="nt">"type"</span><span class="p">:</span> <span class="s2">"equals"</span><span class="p">,</span> <span class="nt">"path"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"lname"</span><span class="p">],</span> <span class="nt">"value"</span><span class="p">:</span> <span class="s2">"Yazici"</span> <span class="p">}</span> <span class="p">]</span> <span class="p">}</span> <span class="p">}</span> <span class="p">]</span> <span class="p">}</span> <span class="p">}</span></code></pre> <p>As depicted above, we split the configuration mutation model into two abstractions: <em>extension</em> and <em>functional extension</em>. An extension is the simplest form of mutation that generally applies to more than 90% of the available configurations. It is basically a JSON object that is upon execution expected to be merged into the original source. 
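<p>The merge semantics of an extension can be sketched as follows. This is again plain JavaScript for illustration, and the rule it assumes (nested objects merge recursively, scalar and array values from the extension win) is one reasonable reading, not necessarily the exact production behavior:</p>

```javascript
// Merges an extension object into a document source: nested objects are
// merged recursively; scalar and array values from the extension win.
// The original source is left untouched.
function extend(source, extension) {
  const result = { ...source };
  for (const [key, value] of Object.entries(extension)) {
    const existing = result[key];
    const bothObjects =
      value !== null && typeof value === "object" && !Array.isArray(value) &&
      existing !== null && typeof existing === "object" && !Array.isArray(existing);
    result[key] = bothObjects ? extend(existing, value) : value;
  }
  return result;
}

const source = { title: "Dune", attrs: { binding: "hardcover" } };
const extended = extend(source, { category: "books", attrs: { language: "en" } });
console.log(JSON.stringify(extended));
// {"title":"Dune","attrs":{"binding":"hardcover","language":"en"},"category":"books"}
```

<p>Under this rule, applying the <code>{"category": "books"}</code> extension to a document simply adds (or overwrites) its <code>category</code> field.</p>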
A simple example is as follows:</p> <pre><code class="language-json"><span class="p">{</span> <span class="nt">"category"</span><span class="p">:</span> <span class="s2">"books"</span> <span class="p">}</span></code></pre> <p>Functional extensions are built to address complex configuration mutations. There we employed <a href="http://www.groovy-lang.org/">Groovy</a> after experimenting with some other candidates, e.g., JavaScript (<a href="https://www.oracle.com%0A/technetwork/articles/java/jf14-nashorn-2126515.html">Nashorn</a>, which is <a href="http://openjdk.java.net/jeps/335">planned to be dropped</a>), Python (<a href="http://www.jython.org/">Jython</a>), Ruby (<a href="https://www.jruby.org/">JRuby</a>), etc. The main drivers for us to pick Groovy are as follows:</p> <ul> <li>It supports direct access to Java data structures (e.g., <code>java.util.Map</code>) without any intermediate translations, hence has no problems processing thousands of mutations on a single core.</li> <li>It is widely adopted to an extent that in the future we might opt for running it against the storage engine.</li> <li>Its runtime performance is on par with the rest of the candidates.</li> </ul> <p>That being said, the decision of Groovy creates a JVM vendor lock-in for the ETL pipeline, though we do not anticipate this to be a problem for at least the coming decade.</p> <p>A sample functional extension is given below.</p> <pre><code class="language-groovy"><span class="kd">static</span> <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Object</span><span class="o">&gt;</span> <span class="n">extend</span><span class="o">(</span><span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Object</span><span class="o">&gt;</span> <span class="n">source</span><span class="o">)</span> <span class="o">{</span> <span class="kt">def</span> <span 
class="n">diskCapacityBytes</span> <span class="o">=</span> <span class="o">(</span><span class="kt">long</span><span class="o">)</span> <span class="n">source</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s2">"disk_capacity_bytes"</span><span class="o">)</span> <span class="kt">def</span> <span class="n">diskCapacityGigabytes</span> <span class="o">=</span> <span class="n">diskCapacityBytes</span> <span class="o">/</span> <span class="mi">1</span><span class="n">e9</span> <span class="n">source</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s2">"disk_capacity_gigabytes"</span><span class="o">,</span> <span class="n">diskCapacityGigabytes</span><span class="o">)</span> <span class="k">return</span> <span class="n">source</span> <span class="o">}</span></code></pre> <p><a name="conclusion"></a></p> <h1 id="conclusion">Conclusion</h1> <p>Implementing an e-commerce search engine is a tough business. The part of the iceberg under the water level – that is, the ETL pipeline – is no less so. In this post, I tried to share the lessons we piled up in the implementation and maintenance of our decade-old ETL pipeline and how we cultivated these to come up with something new. I attempted to explain how the choice of configuration DSL and primary storage engine has far-reaching implications for the rest of the components of the architecture. Elasticsearch has already been serving us pretty well in the search gateway. Taking a step further and employing it in the ETL was a substantially unconventional idea that gave the shivers to every engineer involved in the decision. But the careful consideration and evaluation of potential candidates paid off: It worked!
So when you visit <a href="https://bol.com">bol.com</a> next time, you will know that the Elasticsearch in the ETL pipeline – in addition to the many other Elasticsearch-using services involved – cooked that warm page for you seconds ago.</p> <p><a name="acknowledgements"></a></p> <h1 id="acknowledgements">Acknowledgements</h1> <p>I would like to thank <a href="https://twitter.com/bbuharali">Berkay Buharalı</a>, Lourens Heijs, <a href="https://twitter.com/wvl0">William Leese</a>, <a href="https://almer.tigelaar.net/">Almer S. Tigelaar</a>, Leon Widdershoven, and <a href="https://twitter.com/maurice_zeijen">Maurice Zeijen</a> for their valuable feedback in bringing the post to its final form.</p> <p><a name="faq"></a></p> <h1 id="faq">F.A.Q.</h1> <p>Here I will try to answer certain questions I received via <a href="https://news.ycombinator.com/item?id=18568922">Hacker News</a> or e-mail.</p> <h2 id="did-you-try-tuning-the-postgresql-optimization-knobs">Did you try tuning the PostgreSQL optimization knobs?</h2> <p>bol.com has plenty of databases supported by an army of skilled DBAs. During benchmarks, we collaborated with our PostgreSQL experts to continuously tune the necessary knobs to get the best performance given our data size and access patterns. Hence, it wasn’t a tune-once, run-once operation, but rather a continuous effort to determine an optimal configuration.</p> <h2 id="how-do-you-calculate-searchrank">How do you calculate <code>search_rank</code>?</h2> <p>For benchmarking purposes, we employed a deprecated signal (that is, <code>search_rank</code>) that we used to score the matched documents. In the new search gateway, that approach is replaced with a multitude of context-dependent signals combined at runtime. How the computation of ranking signals works is out of the scope of this post. But in a nutshell, it is an in-house machine learning algorithm harvesting historical user interaction logs. Handling of sudden or seasonal trends?
That is a whole different game.</p> tag:vlkan.com,2018-02-17://blog/post/2018/02/17/varnishing-search-performance/ Varnishing Search Performance 2018-02-17T20:42:00Z 2018-02-17T20:42:00Z <p>This week <a href="http://bol.com">bol.com</a> hosted an <a href="https://www.meetup.com/Elastic-NL/">Elastic User Group NL</a> meetup titled <a href="https://www.meetup.com/Elastic-NL/events/247114723/">bol.com: Changing the (search) engine of a racecar going 300 km/h</a>. The abstract of the presentations was as follows:</p> <blockquote> <p>Almost 2 years ago bol.com decided to move towards an Elasticsearch-powered search engine. But how do you approach such a project? Who do you involve and what do you need to (not) do? The engineers at bol.com would like to share their experiences about this migration, in 4 short talks.</p> </blockquote> <p>And among those 4 short talks, I took the stage with <em>Varnishing Search Performance</em>.</p> <blockquote> <p>Searching is <em>peanuts</em>. You set up your Elasticsearch cluster (or better, find a SaaS partner) and start shooting your search queries against it. Well… Not really. If we put the biblical data ingestion story aside, it won’t take long to realize that even moderately complicated queries can become a bottleneck for those aiming for &lt;50ms query performance. Combine a couple of aggregations, double that for facets of range type, add your grandpa’s boosting factors to the scoring, and there you go; now you are a search query performance bottleneck owner too! Maybe I am exaggerating a bit. Why not just start throwing some caches in front of it? Hrm… We actually thought of that and did so.
Though it brought a mountain of problems along with it, and there goes my story.</p> </blockquote> <p>The slides are available in <a href="varnishing-search-performance.pdf">PDF</a> and <a href="varnishing-search-performance-org.odp">ODP</a> formats.</p> <iframe src="//www.slideshare.net/slideshow/embed_code/key/4h5JWHH25nHGa4" width="476" height="400" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"> </iframe> tag:vlkan.com,2018-02-09://blog/post/2018/02/09/netty-in-action/ Notes on "Netty in Action" 2018-02-09T20:34:00Z 2018-02-09T20:34:00Z <p>Those who had the privilege to read my <a href="/blog/post/2017/04/18/inter-service-comm/">frustration chronicles on intra-microservice communication</a> would easily recall me pointing my finger at the Java Platform SE guys for not shipping a proper HTTP client. There my fury went to the extent of calling it one of the closest candidates for the billion-dollar mistake. Unfortunately, screaming out loud in a blog post does not give much relief, and it took me less than a month to find myself in precisely the same technical mudpot. Indeed, a couple of months after I wrote that post, I was chasing yet another performance problem in one of our aggregation services. In essence, each incoming HTTP request is served by aggregating multiple sources collected again over HTTP. This simple fairy tale architecture gets slaughtered on production by 200 Tomcat threads intertwined with Rx computation and I/O threads resting in the shades of a dozen other thread pools dedicated to so-called asynchronous HTTP clients for aggregated remote services. And I saved the best for last: there were leaking <code>TIME_WAIT</code> sockets.</p> <p>All of a sudden the question occurred to me like the roar of rolling boulders down a steep hill in a far distance: What is the lowest level that I can plumb a networking application in Java without dealing with protocol intricacies?
Put another way, is there a foundational abstraction exposing both the lowest (channel with I/O streams) and highest (HTTP headers and body) levels within reach? I rode both Java OIO and NIO (that is, old- and new-I/O) horses in the past and fell off enough times to learn the hard way that they are definitely not feasible options in this case. The first attempt at searching for a cure on Google introduces you to <a href="http://netty.io/">Netty</a>. If you dig long enough, you stumble upon <a href="http://mina.apache.org/">Apache Mina</a> too. Netty is popular enough in the Java world that it is highly likely you are an indirect consumer of it, unless you are already directly using it. I was aware of its presence like dark matter in every single network application that I wrote, though I have never considered using it directly. Checking the Netty website after dealing with the crippled network applications at hand triggered an epiphany within me: <em>Hey! I can use this to implement some sort of RPC mechanism using Protocol Buffers in HTTP/2 request payloads!</em> Though further investigation swept the dust from the footsteps of giants who had followed the same path: Google (<a href="https://grpc.io/">gRPC</a>), Facebook (<a href="https://github.com/facebook/nifty">Nifty</a>), Twitter (<a href="https://twitter.github.io/finagle/">Finagle</a>), etc. This finding, while crushing my initial excitement, later gave way to the confidence of knowing that I was on the right path.</p> <p>I have always heard good things about both Netty and its community. I have already been sneakily following the <a href="http://normanmaurer.me/presentations/">presentations</a> and <a href="https://twitter.com/normanmaurer">Twitter updates</a> of <a href="http://normanmaurer.me/">Norman Maurer</a>, the Netty shepherd as of date.
Though what triggered me to dive deep into Netty was the following tweet:</p> <blockquote class="twitter-tweet" data-lang="en"> <p lang="en" dir="ltr">Challenge accepted! First step is done. Next: Cover to cover study. <a href="https://t.co/Gnfhbi6Ko0">pic.twitter.com/Gnfhbi6Ko0</a></p>— Volkan Yazıcı (@yazicivo) <a href="https://twitter.com/yazicivo/status/954366672751689728?ref_src=twsrc%5Etfw">January 19, 2018</a> </blockquote> <p>Norman Maurer has always been kind and encouraging to new contributors. So my plan is to turn this into a mutually beneficial relationship: I can contribute and get mentored while doing so.</p> <h1 id="netty-in-action">Netty in Action</h1> <p><a href="https://www.manning.com/books/netty-in-action">The book</a> (2016 press date) is definitely a must-read for anyone planning to use Netty. It lays out Netty fundamentals like channels, handlers, encoders, etc. in detail. That being said, I have got the impression that the content is mostly curated for beginners. For instance, dozens of pages (and an appendix) are spent (wasted?) on a Maven crash course, not to mention the space taken by the shared Maven command outputs. This felt a little bit disappointing considering the existing audience of Netty in general. Who would really read a book about Netty? You have probably had your time with OIO/NIO primitives or client/server frameworks in the market. You certainly don’t want to use yet another library that promises to make all your problems disappear. So I don’t think you can be qualified as a novice in this battle anymore, and you are indeed in search of a scalpel rather than a Swiss army knife.
Nevertheless, I still think the book eventually managed to find a balance between going too deep and just scratching the surface.</p> <h2 id="things-that-are-well-done">Things that are well done</h2> <ul> <li> <p>I really enjoyed the presented <strong>historical perspective</strong> on the development of Java platforms’ networking facilities and Netty itself. Found it quite valuable and wanted to read more and more!</p> </li> <li> <p>Emphasis on <strong><code>ByteBuf</code></strong> was really handy. Later on I learnt that there are people using Netty just for its sound <code>ByteBuf</code> implementation.</p> </li> <li> <p>Almost every single conscious decision within the shared <strong>code snippets is explained in detail</strong>. While this felt like quite some noise in the beginning, later on it turned out to be really helpful – especially while manually updating <code>ByteBuf</code> reference counts.</p> </li> <li> <p>The presented <strong>case studies</strong> were quite interesting to read and inspiring too.</p> </li> </ul> <h2 id="things-that-could-have-been-improved">Things that could have been improved</h2> <ul> <li> <p>I had big hopes of reading about how to implement an HTTP client with <strong>connection pool</strong> support. I particularly find this feature indispensable in a networking application and often not used wisely. Though there wasn’t a single section mentioning connection pooling of any sort.</p> </li> <li> <p>As someone who had studied <a href="http://normanmaurer.me/presentations/">Norman Maurer’s presentations</a>, I was expecting to see waaaay more practical tips about <strong>GC considerations</strong>, updating <strong>socket options</strong> (<code>TCP_NODELAY</code>, <code>SO_SNDBUF</code>, <code>SO_RCVBUF</code>, <code>SO_BACKLOG</code>, etc.), mitigating <strong><code>TIME_WAIT</code></strong> socket problems, and Netty best practices.
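For what it’s worth, such socket options are exposed through <code>Bootstrap#option()</code> and <code>ServerBootstrap#childOption()</code>; a minimal sketch (the class name and the particular values are mine, for illustration only):

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public final class SocketOptionsSketch {

    public static ServerBootstrap configure() {
        return new ServerBootstrap()
                .group(new NioEventLoopGroup())
                .channel(NioServerSocketChannel.class)
                // Options on the parent (server) channel, e.g., the accept queue length.
                .option(ChannelOption.SO_BACKLOG, 1024)
                // Options on each accepted child channel.
                .childOption(ChannelOption.TCP_NODELAY, true)
                .childOption(ChannelOption.SO_SNDBUF, 64 * 1024)
                .childOption(ChannelOption.SO_RCVBUF, 64 * 1024)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        // Add your ChannelHandlers here.
                    }
                });
    }

    public static void main(String[] args) {
        ServerBootstrap bootstrap = configure();
        // Dump the configured parent channel options.
        System.out.println(bootstrap.config().options());
        bootstrap.config().group().shutdownGracefully();
    }
}
```

What *values* to pick for these knobs is exactly the kind of experience-distilled guidance I missed in the book.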
Maybe adding this content would have doubled the size of the book, though I still think a book on Netty is incomplete without such practical tips.</p> </li> <li> <p>Many inbound requests trigger multiple I/O operations in a typical network application. It is crucial to not let these operations block a running thread, which Netty is well aware of and hence ships a fully-fledged <code>EventExecutor</code> abstraction. This crucial detail is mentioned in many places within the book, though none gives a concrete example. Such a common need deserves a demonstration.</p> </li> </ul> <h1 id="notes">Notes</h1> <p>I always take notes while reading a book. Let it be a grammar mistake, code typo, incorrect or ambiguous information, thought-provoking know-how, practical tip, etc. You name it. Here I will share them in page order. I will further classify my notes in 4 groups: <span class="note-mistake">mistakes</span>, <span class="note-improvement">improvements</span>, <span class="note-question">questions</span>, and <span class="note-other">other</span>.</p> <ul> <li> <p><span class="note-question">[p19, Listing 2.1]</span> Why did we use <code>ctx.writeAndFlush(Unpooled.EMPTY_BUFFER)</code> rather than just calling <code>ctx.flush()</code>?</p> </li> <li> <p><span class="note-mistake">[p21, Listing 2.2]</span> Typo in <code>throws Exceptio3n</code>.</p> </li> <li> <p><span class="note-improvement">[p49, Section 4.3.1]</span> The listed items</p> <blockquote> <ul> <li>A new <code>Channel</code> was accepted and is ready.</li> <li>A <code>Channel</code> connection …</li> </ul> </blockquote> <p>are an identical repetition of Table 4.3.</p> </li> <li> <p><span class="note-improvement">[p60]</span> <code>CompositeByteBuf</code> has the following remark:</p> <blockquote> <p>Note that Netty optimizes socket I/O operations that employ <code>CompositeByteBuf</code>, eliminating whenever possible the performance and memory usage penalties that are incurred with JDK’s
buffer implementation. This optimization takes place in Netty’s core code and is therefore not exposed, but you should be aware of its impact.</p> </blockquote> <p>Interesting. Good to know. I should be aware of <em>its impact</em>. But how can I measure and relate this impact? Maybe I am just nitpicking, though I would love to hear a little bit more.</p> </li> <li> <p><span class="note-question">[p77, Table 6.3]</span> <code>channelWritabilityChanged()</code> method of <code>ChannelInboundHandler</code>… How come an inbound channel can have a writability notion? I would have expected an inbound channel to be just readable.</p> </li> <li> <p><span class="note-improvement">[p78, Section 6.1.4]</span> Starts with a really intriguing paragraph:</p> <blockquote> <p>A powerful capability of <code>ChannelOutboundHandler</code> is to defer an operation or event on demand, which allows for sophisticated approaches to request handling. If writing to the remote peer is suspended, for example, you can defer flush operations and resume them later.</p> </blockquote> <p>Though it ends here. No more explanations, not even a single example, etc. A total mystery.</p> </li> <li> <p><span class="note-question">[p79, Table 6.4]</span> <code>read()</code> method of a <code>ChannelOutboundHandler</code>… Similar to <code>ChannelInboundHandler#channelWritabilityChanged()</code>, how come an outbound channel can have a read method? What are we reading that is supposed to be already originating from us and destined to a remote peer?</p> </li> <li> <p><span class="note-improvement">[p79, Section 6.1.4]</span> It goes as follows:</p> <blockquote> <p><strong><code>ChannelPromise</code> vs. <code>ChannelFuture</code></strong> Most of the methods in <code>ChannelOutboutHandler</code> take a <code>ChannelPromise</code> argument to be notified when the operation completes.
<code>ChannelPromise</code> is a subinterface of <code>ChannelFuture</code> that defines the writable methods, such as <code>setSuccess()</code> or <code>setFailure()</code>, thus making <code>ChannelFuture</code> immutable.</p> </blockquote> <p>Ok, but why? I know the difference between a <code>Future</code> and a <code>Promise</code>, though I still cannot see the necessity for outbound handlers to employ <code>Promise</code> instead of a <code>Future</code>.</p> </li> <li> <p><span class="note-question">[p84, Listing 6.5]</span> While adding handlers to a pipeline, what happens in the case of a name conflict?</p> </li> <li> <p><span class="note-improvement">[p84]</span> A remark is dropped on the <strong><code>ChannelHandler</code> execution and blocking</strong> subject. Just in time! Though it misses a demonstration.</p> </li> <li> <p><span class="note-question">[p86, Listing 6.9]</span> Again a <code>read()</code> method for the outbound operations of a <code>ChannelPipeline</code>. 
I am really puzzled by the notion of reading from an outbound channel.</p> </li> <li> <p><span class="note-question">[p94, Listing 6.13]</span> What happens when a <code>ChannelFuture</code> completes before adding a listener to it?</p> </li> <li> <p><span class="note-mistake">[p95, Section 6.5]</span> Last paragraph goes like this:</p> <blockquote> <p>The next chapter will focus on Netty’s codec abstraction, which makes writing protocol encoders and decoders much easier than using the underlying <code>ChannelHandler</code> implementations directly.</p> </blockquote> <p>Though the next chapter focuses on <code>EventLoop</code> and the threading model.</p> </li> <li> <p><span class="note-question">[p102, Listing 7.3]</span> Speaking of scheduling <code>Runnable</code>s to a channel’s event loop, what if the channel gets closed before the scheduled tasks are triggered?</p> </li> <li> <p><span class="note-improvement">[p103]</span> Page starts with the following last paragraph:</p> <blockquote> <p>These examples illustrate the performance gain that can be achieved by taking advantage of Netty’s scheduling capabilities.</p> </blockquote> <p>Really? Netty’s scheduling capabilities are shown by using each function in isolation. Though I still don’t have a clue on how these capabilities can be used for a performance gain. This is a <strong>common problem throughout the book</strong>: The innocent flashy statement hangs in the air, waiting for a demonstration that shares some insight distilled by experience.</p> </li> <li> <p><span class="note-mistake">[p104, Figure 7.4]</span> The caption of the figure is as follows:</p> <blockquote> <p><code>EventLoop</code> allocation for non-blocking transports (such as NIO and AIO)</p> </blockquote> <p>AIO?
Looks like a typo.</p> </li> <li> <p><span class="note-mistake">[p107]</span> Chapter starts with the following opening paragraph:</p> <blockquote> <p>Having studied <code>ChannelPipeline</code>s, <code>ChannelHandler</code>s, and codec classes in depth, …</p> </blockquote> <p>Nope. Nothing has been mentioned so far about codec classes.</p> </li> <li> <p><span class="note-improvement">[p112]</span> It is explained that, in the context of <code>Bootstrap</code>, <code>bind()</code> and <code>connect()</code> can throw <code>IllegalStateException</code> if some combination of <code>group()</code>, <code>channel()</code>, <code>channelHandler()</code>, and/or <code>handler()</code> method calls is missing. Similarly, calling <code>attr()</code> after <code>bind()</code> has no effect. I personally find such abstractions poorly designed. I would rather have used the <a href="https://immutables.github.io/immutable.html#staged-builder">staged builder pattern</a> and avoid such intricacies at compile-time.</p> </li> <li> <p><span class="note-mistake">[p117, Listing 8.6]</span> The 2nd argument to <code>Bootstrap#group()</code> looks like a typo.</p> </li> <li> <p><span class="note-improvement">[p120]</span> Check this end of chapter summary out:</p> <blockquote> <p>In this chapter you learned how to bootstrap Netty server and client applications, including those that use connectionless protocols. We covered a number of special cases, including bootstrapping client channels in server applications and using a <code>ChannelInitializer</code> to handle the installation of multiple <code>ChannelHandler</code>s during bootstrapping. You saw how to specify configuration options on channels and how to attach information to a channel using attributes. 
Finally, you learned how to shut down an application gracefully to release all resources in an orderly fashion.</p> <p>In the next chapter we’ll examine the tools Netty provides to help you test your <code>ChannelHandler</code> implementations.</p> </blockquote> <p>I have always found such summaries useless, since they are a repetition of the chapter introduction, and hence a waste of space. Rather, just give the crucial takeaways, preferably in a form that is digestible at a glance. For instance, <em>use <code>EventLoopGroup.shutdownGracefully()</code></em>, etc.</p> </li> <li> <p><span class="note-improvement">[p121]</span> I suppose the <em>Unit Testing</em> chapter used to come after <em>Codecs</em> in previous prints, and the authors have moved it to an earlier stage to establish a certain coherence in the introductory chapters. Though, reading <em>Codecs</em> reveals that there is close to 70% overlap in content, which feels like a poorly structured flow. I see the value in the authors’ attempt, though there is quite some room for improvement via tuning the breakdown of chapters.</p> </li> <li> <p><span class="note-mistake">[p124, Section 9.2.1]</span> <code>ByteToMessageDecoder</code> is used before it is explained. (See my remark above.)</p> </li> <li> <p><span class="note-improvement">[p127]</span> The following bullets</p> <blockquote> <p>Here are the steps executed in the code:</p> <ol> <li>Writes negative 4-byte integers to a new <code>ByteBuf</code>.</li> <li>Creates an <code>EmbeddedChannel</code> …</li> </ol> </blockquote> <p>are a repetition of the descriptions available in Listing 9.4.</p> </li> <li> <p><span class="note-mistake">[p138, Listing 10.3]</span> Comma missing after <code>Integer msg</code>.</p> </li> <li> <p><span class="note-question">[p141]</span> Why do the <code>MessageToMessage{Encoder,Decoder}</code> classes not have an output type, but just <code>Object</code>?
How do you ensure type safety while chaining them along a pipeline?</p> </li> <li> <p><span class="note-mistake">[p142, Listing 10.6]</span> Comma missing after <code>Integer msg</code>.</p> </li> <li> <p><span class="note-mistake">[p145, Listing 10.7]</span> Constructor of <code>MyWebSocketFrame</code> is named incorrectly.</p> </li> <li> <p><span class="note-improvement">[p151, Section 11.2]</span> I think <em>Building Netty HTTP/HTTPS applications</em> deserves its own chapter. And a very important subject is missing: connection pooling.</p> </li> <li> <p><span class="note-question">[p157, Listing 11.6]</span> While building the WebSocket pipeline, which handler addresses ping/pong frames?</p> </li> <li> <p><span class="note-mistake">[p159, Table 11.4]</span> The first sentence in the description of <code>WriteTimeoutHandler</code> is identical to the one in <code>ReadTimeoutHandler</code>. Supposedly a copy-paste side-effect.</p> </li> <li> <p><span class="note-mistake">[p171]</span> Check out the first paragraph:</p> <blockquote> <p>WebSocket is an advanced network protocol that has been developed to improve the performance and responsiveness of web applications. We’ll explore Netty’s support for <em>each of them</em> by writing a sample application.</p> </blockquote> <p>Each of them? Who are they?</p> </li> <li> <p><span class="note-mistake">[p177]</span> <em>The call to <code>retain()</code> is needed because after <code>channelRead()</code> …</em> → <em>The call to <code>retain()</code> is needed because after <code>channelRead0()</code> …</em></p> </li> <li> <p><span class="note-improvement">[p178, Table 12.1]</span> Identical to Table 11.3.</p> </li> <li> <p><span class="note-mistake">[p181, Figure 12.3]</span> <code>ChunkedWriteHandler</code> is missing.</p> </li> <li> <p><span class="note-question">[p183, Listing 12.4]</span> There the shutdown of the chat server is realized via <code>Runtime.getRuntime().addShutdownHook()</code>. 
Is this a recommended practice?</p> </li> <li> <p><span class="note-mistake">[p189]</span> <em>Figure 14.1 presents a high-level view of the …</em> → <em>Figure 13.1</em></p> </li> <li> <p><span class="note-mistake">[p189]</span> <em>Listing 14.1 shows the details of this simple POJO.</em> → <em>Listing 13.1</em></p> </li> <li> <p><span class="note-improvement">[p190, Listing 13.1]</span> <code>received</code> field is not used at all. Could be removed to increase clarity. Interestingly, the field is not even encoded.</p> </li> <li> <p><span class="note-mistake">[p191, Table 13.1]</span> <code>extendsDefaultAddressedEnvelope</code> → <code>extends DefaultAddressedEnvelope</code></p> </li> <li> <p><span class="note-mistake">[p191]</span> <em>Figure 14.2 shows the broadcasting of three log …</em> → <em>Figure 13.2</em></p> </li> <li> <p><span class="note-mistake">[p192]</span> <em>Figure 14.3 represents a high-level view of the …</em> → <em>Figure 13.3</em></p> </li> <li> <p><span class="note-improvement">[p192, Listing 13.2]</span> A <code>byte[] file</code> and <code>byte[] msg</code> pair is encoded as follows:</p> <pre><code class="language-java"><span class="n">buf</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">file</span><span class="o">);</span> <span class="n">buf</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">LogEvent</span><span class="o">.</span><span class="na">SEPARATOR</span><span class="o">);</span> <span class="n">buf</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">msg</span><span class="o">);</span></code></pre> <p>Later on each entry is read back by splitting at <code>LogEvent.SEPARATOR</code>. What if <code>file</code> contains <code>LogEvent.SEPARATOR</code>? I think this is a bad encoding practice. 
I would rather do:</p> <pre><code class="language-java"><span class="n">buf</span><span class="o">.</span><span class="na">writeInt</span><span class="o">(</span><span class="n">file</span><span class="o">.</span><span class="na">length</span><span class="o">);</span> <span class="n">buf</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">file</span><span class="o">);</span> <span class="n">buf</span><span class="o">.</span><span class="na">writeInt</span><span class="o">(</span><span class="n">msg</span><span class="o">.</span><span class="na">length</span><span class="o">);</span> <span class="n">buf</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">msg</span><span class="o">);</span></code></pre> </li> <li> <p><span class="note-question">[p194, Listing 13.3]</span> Is there a constant for <code>255.255.255.255</code> broadcast address?</p> </li> <li> <p><span class="note-mistake">[p195]</span> <em>Figure 14.4 depicts the <code>ChannelPipeline</code> of the <code>LogEventonitor</code> …</em> → <em>Figure 13.4</em></p> </li> <li> <p><span class="note-improvement">[p196]</span> Check this out:</p> <blockquote> <p>The <code>LogEventHandler</code> prints the <code>LogEvent</code>s in an easy-to-read format that consists of the following:</p> <ul> <li>The received timestamp in milliseconds.</li> </ul> </blockquote> <p>Really? I did not know epoch timestamps were <em>easy-to-read</em>. 
Maybe for some definition of easy-to-read.</p> </li> <li> <p><span class="note-mistake">[p195]</span> <em>Now we need to install our handlers in the <code>ChannelPipeline</code>, as seen in figure 14.4.</em> → <em>Figure 13.4</em></p> </li> <li> <p><span class="note-mistake">[p205]</span> <em>Approach A, optimistic and apparently simpler (figure 15.1)</em> → <em>figure 14.1</em></p> </li> <li> <p><span class="note-improvement">[p206]</span> Half of the page is spent justifying Droplr’s preference for approach B (safe and complex) over approach A (optimistic and simpler). Call me an idiot, but I am not sold on the argument that the latter approach is less safe.</p> </li> <li> <p><span class="note-mistake">[p207]</span> Type of <code>pipelineFactory</code> is missing.</p> </li> <li> <p><span class="note-improvement">[p210]</span> There is a bullet on tuning the JVM. This on its own could have been a really interesting chapter of this book.</p> </li> <li> <p><span class="note-other">[p213]</span> Firebase is indeed implementing TCP-over-long-polling.
I wonder if there exist any Java libraries that implement user-level TCP over a certain channel abstraction.</p> </li> <li> <p><span class="note-mistake">[p214]</span> <em>Figure 15.4 demonstrates how the Firebase long-polling …</em> → <em>Figure 14.4</em></p> </li> <li> <p><span class="note-mistake">[p215]</span> <em>Figure 15.5 illustrates how Netty lets Firebase respond to …</em> → <em>Figure 14.5</em></p> </li> <li> <p><span class="note-mistake">[p216]</span> <em>… can start as soon as byes come in off the wire.</em> → <em>bytes</em></p> </li> <li> <p><span class="note-mistake">[p217, Listing 14.3]</span> Last parenthesis is missing:</p> <pre><code class="language-scala"><span class="n">rxBytes</span> <span class="o">+=</span> <span class="n">buf</span><span class="o">.</span><span class="n">readableBytes</span><span class="o">(</span> <span class="n">tryFlush</span><span class="o">(</span><span class="n">ctx</span><span class="o">)</span></code></pre> </li> <li> <p><span class="note-improvement">[p217, Listing 14.3]</span> 70% of the intro was about implementing a control flow over long polling, though the shared code snippet is about something else entirely and almost irrelevant.</p> </li> <li> <p><span class="note-mistake">[p223]</span> <em>In referring to figure 15.1, note that two paths …</em> → <em>figure 14.6</em></p> </li> <li> <p><span class="note-mistake">[p229]</span> <em>This request/execution flow is shown in figure 16.1.</em> → <em>figure 15.1</em></p> </li> <li> <p><span class="note-mistake">[p230]</span> <em>Figure 16.2 shows how pipelined requests are handled …</em> → <em>Figure 15.2</em></p> </li> <li> <p><span class="note-mistake">[p230]</span> <em>…, in the required order.
See figure 16.3.</em> → <em>figure 15.3</em></p> </li> <li> <p><span class="note-mistake">[p232]</span> <em>That simple flow (show in figure 16.4) works…</em> → <em>figure 15.4</em></p> </li> <li> <p><span class="note-improvement">[p232]</span> <em>The client call is dispatched to the Swift library, …</em> What is Swift library? Was not explained anywhere.</p> </li> <li> <p><span class="note-mistake">[p232]</span> <em>This is the flow shown in figure 16.5.</em> → <em>figure 15.5</em></p> </li> <li> <p><span class="note-other">[p234]</span> This is a really interesting piece:</p> <blockquote> <p>Before <a href="https://github.com/facebook/nifty">Nifty</a>, many of our major Java services at Facebook used an older, custom NIO-based Thrift server implementation that works similarly to Nifty. That implementation is an older codebase that had more time to mature, but because its asynchronous I/O handling code was built from scratch, and because Nifty is built on the solid foundation of Netty’s asynchronous I/O framework, it has had many fewer problems.</p> <p>One of our custom message queuing services had been built using the older framework, and it started to suffer from a kind of socket leak. A lot of connections were sitting around in <code>CLOSE_WAIT</code> state, meaning the server had received a notification that the client had closed the socket, but the server never reciprocated by making its own call to close the socket. This left the sockets in a kind of <code>CLOSE_WAIT</code> limbo.</p> <p>The problem happened very slowly; across the entire pool of machines handling this service, there might be millions of requests per second, but usually only one socket on one server would enter this state in an hour. It wasn’t an urgent issue because it took a long time before a server needed a restart at that rate, but it also complicated tracking down the cause. 
Extensive digging through the code didn’t help much either: initially several places looked suspicious, but everything ultimately checked out and we didn’t locate the problem.</p> </blockquote> </li> <li> <p><span class="note-mistake">[p238]</span> <em>Figure 16.6 shows the relationship between …</em> → <em>figure 15.6</em></p> </li> <li> <p><span class="note-improvement">[p239, Listing 15.2]</span> All the presented Scala code in this chapter is over-complicated, and the complexity does not serve any purpose except wasting space and increasing cognitive load. For instance, why does <code>ChannelConnector</code> extend <code>(SocketAddress =&gt; Future[Transport[In, Out]])</code> rather than just being a simple method?</p> </li> <li> <p><span class="note-improvement">[p239]</span> <em>This factory is provided a <code>ChannelPipelineFactory</code>, which is …</em> What is <em>this factory</em>?</p> </li> </ul> <style type="text/css"> span.note-mistake { color: red; } span.note-improvement { color: orange; } span.note-question { color: green; } span.note-other { color: silver; } </style> <h1 id="conclusion">Conclusion</h1> <p>In summary, <a href="https://www.manning.com/books/netty-in-action">Netty in Action</a> is a book that I would recommend to everyone who wants to learn more about Netty in order to use it in their applications. Almost the entire set of fundamental Netty abstractions is covered in detail. The content is a blessing for novice users in the networking domain, though this in turn might make the book uninteresting for people who have already got their hands pretty dirty with the networking facilities available in the Java Platform. That being said, the presented historical perspective and shared case studies are still pretty attractive even for the most advanced users.</p> <p>I don’t know much about the 2<sup>nd</sup> author of the book, Marvin Allen Wolfthal, but the 1<sup>st</sup> author, Norman Maurer, is a pretty well-known figure in the F/OSS ecosystem. 
If he manages to transfer more juice from his experience and presentations to the book, I will definitely buy the 2<sup>nd</sup> print of the book too!</p> tag:vlkan.com,2017-04-18://blog/post/2017/10/20/hazelcast-guice/ Guice Integration in Hazelcast 2017-04-18T17:22:00Z 2017-04-18T17:22:00Z <p>On many occasions I find the distributed <code>ExecutorService</code> of Hazelcast (aka. <code>IExecutorService</code>) pretty convenient to turn a set of nodes into a tamed cluster waiting for orders. You just submit either a <code>Runnable</code> or a <code>Callable&lt;T&gt;</code> and Hazelcast takes care of the rest – executing the task on remote members, relaying the response(s) back, etc. Though note that since the task and its response will be delivered over the wire, it is no surprise that they all need to be <code>Serializable</code>.</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.hazelcast.core.Hazelcast</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.HazelcastInstance</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.IExecutorService</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.Member</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.MultiExecutionCallback</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.io.Serializable</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.Map</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.Callable</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.CompletableFuture</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.TimeUnit</span><span class="o">;</span> <span 
class="kd">public</span> <span class="kd">enum</span> <span class="n">HzGuiceDemo</span> <span class="o">{;</span> <span class="kd">public</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">ProcessorCountTask</span> <span class="kd">implements</span> <span class="n">Serializable</span><span class="o">,</span> <span class="n">Callable</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="o">{</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">call</span><span class="o">()</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Runtime</span><span class="o">.</span><span class="na">getRuntime</span><span class="o">().</span><span class="na">availableProcessors</span><span class="o">();</span> <span class="o">}</span> <span class="o">}</span> <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Throwable</span> <span class="o">{</span> <span class="n">HazelcastInstance</span> <span class="n">hzInstance</span> <span class="o">=</span> <span class="n">Hazelcast</span><span class="o">.</span><span class="na">newHazelcastInstance</span><span class="o">();</span> <span class="n">IExecutorService</span> <span class="n">hzExecutorService</span> <span class="o">=</span> <span class="n">hzInstance</span><span class="o">.</span><span class="na">getExecutorService</span><span class="o">(</span><span class="s">"ballpark"</span><span class="o">);</span> <span class="n">CompletableFuture</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">totalProcessorCountFuture</span> <span class="o">=</span> <span 
class="k">new</span> <span class="n">CompletableFuture</span><span class="o">&lt;&gt;();</span> <span class="n">hzExecutorService</span><span class="o">.</span><span class="na">submitToAllMembers</span><span class="o">(</span> <span class="k">new</span> <span class="nf">ProcessorCountTask</span><span class="o">(),</span> <span class="k">new</span> <span class="nf">MultiExecutionCallback</span><span class="o">()</span> <span class="o">{</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onResponse</span><span class="o">(</span><span class="n">Member</span> <span class="n">member</span><span class="o">,</span> <span class="n">Object</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="c1">// Ignored.</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onComplete</span><span class="o">(</span><span class="n">Map</span><span class="o">&lt;</span><span class="n">Member</span><span class="o">,</span> <span class="n">Object</span><span class="o">&gt;</span> <span class="n">values</span><span class="o">)</span> <span class="o">{</span> <span class="kt">int</span> <span class="n">totalProcessorCount</span> <span class="o">=</span> <span class="n">values</span> <span class="o">.</span><span class="na">values</span><span class="o">()</span> <span class="o">.</span><span class="na">stream</span><span class="o">()</span> <span class="o">.</span><span class="na">mapToInt</span><span class="o">(</span><span class="n">object</span> <span class="o">-&gt;</span> <span class="o">(</span><span class="kt">int</span><span class="o">)</span> <span class="n">object</span><span class="o">)</span> <span class="o">.</span><span class="na">sum</span><span class="o">();</span> <span class="n">totalProcessorCountFuture</span><span class="o">.</span><span class="na">complete</span><span 
class="o">(</span><span class="n">totalProcessorCount</span><span class="o">);</span> <span class="o">}</span> <span class="o">});</span> <span class="kt">int</span> <span class="n">totalProcessorCount</span> <span class="o">=</span> <span class="n">totalProcessorCountFuture</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">10</span><span class="o">,</span> <span class="n">TimeUnit</span><span class="o">.</span><span class="na">SECONDS</span><span class="o">);</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"there are %d processors in total%n"</span><span class="o">,</span> <span class="n">totalProcessorCount</span><span class="o">);</span> <span class="n">hzInstance</span><span class="o">.</span><span class="na">shutdown</span><span class="o">();</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>Unfortunately, many of our tasks are not <em>stateless</em> – that is, isolated from the rest of the application state – like the <code>ProcessorCountTask</code> given above. Most of the time, the functional requirements necessitate access to remote node state that is available through beans provided by the underlying dependency injection framework. 
Consider the following stateful <code>PizzaService</code> that is responsible for cooking pizzas for its users.</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">javax.inject.Singleton</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">static</span> <span class="n">com</span><span class="o">.</span><span class="na">google</span><span class="o">.</span><span class="na">common</span><span class="o">.</span><span class="na">base</span><span class="o">.</span><span class="na">Preconditions</span><span class="o">.</span><span class="na">checkArgument</span><span class="o">;</span> <span class="nd">@Singleton</span> <span class="kd">public</span> <span class="kd">class</span> <span class="nc">PizzaService</span> <span class="o">{</span> <span class="kd">private</span> <span class="kd">volatile</span> <span class="kt">int</span> <span class="n">totalPizzaCount</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">synchronized</span> <span class="kt">void</span> <span class="nf">cook</span><span class="o">(</span><span class="kt">int</span> <span class="n">amount</span><span class="o">)</span> <span class="o">{</span> <span class="n">checkArgument</span><span class="o">(</span><span class="n">amount</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">,</span> <span class="s">"expecting: amount &gt; 0, found: %s"</span><span class="o">,</span> <span class="n">amount</span><span class="o">);</span> <span class="n">totalPizzaCount</span> <span class="o">+=</span> <span class="n">amount</span><span class="o">;</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"🍕 cooking %d pizza(s)%n"</span><span class="o">,</span> <span class="n">amount</span><span 
class="o">);</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>We further have a task class to remotely command <code>PizzaService</code> to cook:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">java.io.Serializable</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">javax.inject.Inject</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">class</span> <span class="nc">PizzaCookTask</span> <span class="kd">implements</span> <span class="n">Serializable</span><span class="o">,</span> <span class="n">Runnable</span> <span class="o">{</span> <span class="nd">@Inject</span> <span class="kd">private</span> <span class="n">PizzaService</span> <span class="n">pizzaService</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">final</span> <span class="kt">int</span> <span class="n">amount</span><span class="o">;</span> <span class="kd">public</span> <span class="nf">PizzaCookTask</span><span class="o">(</span><span class="kt">int</span> <span class="n">amount</span><span class="o">)</span> <span class="o">{</span> <span class="k">this</span><span class="o">.</span><span class="na">amount</span> <span class="o">=</span> <span class="n">amount</span><span class="o">;</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span> <span class="n">pizzaService</span><span class="o">.</span><span class="na">cook</span><span class="o">(</span><span class="n">amount</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>A naive approach to run this task on an <code>IExecutorService</code> would result in the following code:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.hazelcast.core.Hazelcast</span><span class="o">;</span> <span class="kn">import</span> <span 
class="nn">com.hazelcast.core.HazelcastInstance</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.IExecutorService</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.CompletableFuture</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.TimeUnit</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">enum</span> <span class="n">HzGuiceDemo</span> <span class="o">{;</span> <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Throwable</span> <span class="o">{</span> <span class="n">HazelcastInstance</span> <span class="n">hzInstance</span> <span class="o">=</span> <span class="n">Hazelcast</span><span class="o">.</span><span class="na">newHazelcastInstance</span><span class="o">();</span> <span class="n">IExecutorService</span> <span class="n">hzExecutorService</span> <span class="o">=</span> <span class="n">hzInstance</span><span class="o">.</span><span class="na">getExecutorService</span><span class="o">(</span><span class="s">"ballpark"</span><span class="o">);</span> <span class="n">hzExecutorService</span><span class="o">.</span><span class="na">executeOnAllMembers</span><span class="o">(</span><span class="k">new</span> <span class="nf">PizzaCookTask</span><span class="o">(</span><span class="mi">1</span><span class="o">));</span> <span class="n">hzInstance</span><span class="o">.</span><span class="na">shutdown</span><span class="o">();</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>which fails with a sweet <code>NullPointerException</code> as follows:</p> <pre><code>Exception in thread "main" 
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) at com.vlkan.hzguicedemo.HzGuiceDemo.main(HzGuiceDemo.java:??) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at com.hazelcast.executor.DistributedExecutorService$CallableProcessor.run(DistributedExecutorService.java:189) at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:186) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92) Caused by: java.lang.NullPointerException at com.vlkan.hzguicedemo.HzGuiceDemo$PizzaCookTask.call(HzGuiceDemo.java:??) at com.vlkan.hzguicedemo.HzGuiceDemo$PizzaCookTask.call(HzGuiceDemo.java:??) </code></pre> <p>What is really happening here is that Hazelcast does not have a magical ball to guess the dependency injection framework you are using to process the <code>@Inject</code>-annotated properties of the <code>PizzaCookTask</code>. Though Hazelcast has something else: <a href="http://docs.hazelcast.org/docs/2.3/manual/html/ch14s02.html">ManagedContext</a>. In a nutshell, <code>ManagedContext</code> provides means to intercept class instantiation at deserialization. 
We can leverage this functionality to come up with a <code>ManagedContext</code> implementation that bakes Guice dependency injection into the Hazelcast class instantiation process.</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.google.inject.Injector</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.hazelcast.core.ManagedContext</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">javax.inject.Inject</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">javax.inject.Singleton</span><span class="o">;</span> <span class="nd">@Singleton</span> <span class="kd">public</span> <span class="kd">class</span> <span class="nc">HazelcastGuiceManagedContext</span> <span class="kd">implements</span> <span class="n">ManagedContext</span> <span class="o">{</span> <span class="kd">private</span> <span class="kd">final</span> <span class="n">Injector</span> <span class="n">injector</span><span class="o">;</span> <span class="nd">@Inject</span> <span class="kd">public</span> <span class="nf">HazelcastGuiceManagedContext</span><span class="o">(</span><span class="n">Injector</span> <span class="n">injector</span><span class="o">)</span> <span class="o">{</span> <span class="k">this</span><span class="o">.</span><span class="na">injector</span> <span class="o">=</span> <span class="n">injector</span><span class="o">;</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="n">Object</span> <span class="nf">initialize</span><span class="o">(</span><span class="n">Object</span> <span class="n">instance</span><span class="o">)</span> <span class="o">{</span> <span class="n">injector</span><span class="o">.</span><span class="na">injectMembers</span><span class="o">(</span><span class="n">instance</span><span class="o">);</span> <span class="k">return</span> <span class="n">instance</span><span 
class="o">;</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>Next, all you need to do is use this <code>ManagedContext</code> while creating your <code>HazelcastInstance</code>:</p> <pre><code class="language-java"><span class="n">Injector</span> <span class="n">injector</span> <span class="o">=</span> <span class="n">Guice</span><span class="o">.</span><span class="na">createInjector</span><span class="o">();</span> <span class="n">HazelcastGuiceManagedContext</span> <span class="n">guiceManagedContext</span> <span class="o">=</span> <span class="n">injector</span><span class="o">.</span><span class="na">getInstance</span><span class="o">(</span><span class="n">HazelcastGuiceManagedContext</span><span class="o">.</span><span class="na">class</span><span class="o">);</span> <span class="n">Config</span> <span class="n">hzConfig</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">Config</span><span class="o">();</span> <span class="n">hzConfig</span><span class="o">.</span><span class="na">setManagedContext</span><span class="o">(</span><span class="n">guiceManagedContext</span><span class="o">);</span> <span class="n">HazelcastInstance</span> <span class="n">hzInstance</span> <span class="o">=</span> <span class="n">Hazelcast</span><span class="o">.</span><span class="na">newHazelcastInstance</span><span class="o">(</span><span class="n">hzConfig</span><span class="o">);</span></code></pre> <p>While I have provided an example for Guice, this method is applicable to any dependency injection framework that provides an equivalent to <code>Injector#injectMembers()</code> of Guice. 
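</p> <p>For what it’s worth, the essence of <code>initialize()</code> above is plain member injection. Stripped of any container, it can be sketched with a few lines of reflection; the names below (<code>MemberInjectionSketch</code>, the single-bean <code>initialize()</code> helper) are made up for illustration and are not Hazelcast or Guice API:</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Framework-free sketch: what an injectMembers()-style hook boils down to.
public class MemberInjectionSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Inject {}

    public static class PizzaService {}

    public static class PizzaCookTask implements Runnable {
        @Inject PizzaService pizzaService;
        @Override public void run() { /* pizzaService.cook(amount); */ }
    }

    // Equivalent of ManagedContext#initialize(): populate every
    // @Inject-annotated field of the freshly deserialized instance.
    static Object initialize(Object instance, Object bean) {
        for (Field field : instance.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(Inject.class)) {
                field.setAccessible(true);
                try {
                    field.set(instance, bean);
                } catch (IllegalAccessException error) {
                    throw new IllegalStateException(error);
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        PizzaCookTask task = new PizzaCookTask();
        initialize(task, new PizzaService());
        System.out.println(task.pizzaService != null); // prints "true"
    }
}
```

<p>A real <code>ManagedContext</code> delegates this lookup to the container (Guice’s <code>Injector</code>, Spring’s application context), which is why the adapter shown earlier is so small.</p> <p>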
Needless to say, Spring folks are already covered by the <code>SpringManagedContext</code> shipped with Hazelcast.</p> tag:vlkan.com,2017-04-18://blog/post/2017/04/18/inter-service-comm/ Inter-Microservice Communication Fatigue 2017-04-18T17:22:00Z 2017-04-18T17:22:00Z <p>Let me get this straight: <strong>Every single line of code that needs to communicate with a remote microservice is the most bizarre, annoying, sad, and hopeless experience in my daily coding routine.</strong> And the worst is: Most of the time it’s <em>my client</em> code communicating with <em>my services</em>, so there is no one else to blame who would soothe my anger. But I did not end up here out of the blue.</p> <p>In my freshman years, I was given the responsibility of further developing a microservice, where both the server and its driver (API models, HTTP client, etc.) were written in Scala, because Scala was still a cool buzzword back then and the team wanted to experiment with it. (No, this won’t be a Scala FUD post.) It was using an in-house-built HTTP client, which was more or less yet another buggy wrapper over an ancient version of <a href="https://github.com/ning/async-http-client">Ning async-http-client</a>. I implemented a (yes, another!) thin wrapper over it to expose the HTTP response models as <code>scala.concurrent.Future</code>s, so we could compose them via Scala’s for-comprehensions. (It did not take long for me to figure out that exposing the API in Scala was one of the worst possible design decisions one could have made in an ecosystem dominated by Java consumers.)</p> <p>Later on, as a team, we adopted another critical microservice comprising a giant Java client with Spring fizz and buzz, caching, and Guava immutables all under the hood, with insanely strict <code>checkNotNull</code>/<code>checkArgument</code>-powered model validation, etc. This comet created its own fan clubs. 
There are two types of people in the company who are consuming this service:</p> <ol> <li> <p>ones that bite the bullet and use the gigantic driver we provide (say hello to a truckload of artifacts in your not-sufficiently-sucking dependency hell) or</p> </li> <li> <p>ones that prefer to implement their own HTTP driver hosting an empire of bugs by manually building/parsing query request/response models formatted in JSON/XML/Protobuf.</p> </li> </ol> <p>Later on I said enough is enough! Let’s stick to a standard: the JEE HTTP client, that is, the Jersey JAX-RS Client with Jackson cream on it. I still needed to create all the API models and verify them myself every time. It was bearable to some extent. But here comes the perfect storm: JAX-RS Client 2.0 (which supports proper HTTP connection pooling with sanely configurable socket+connection timeout support, which weren’t sanely available in 1.x) requires <code>javax.ws.rs-api</code> 2.x, which is binary incompatible with 1.x, which is used by 80% of the microservices in the ecosystem. So in practice no other microservice will be able to use my driver without the developer losing half of their hair.</p> <p>Later on I kept repeating “enough is enough”! Let’s use <a href="https://github.com/AsyncHttpClient/async-http-client">Google’s async-http-client</a>. It is pluggable all over the place: the HTTP connector (Apache HC, etc.), the marshaller (Jackson, Gson, etc.). The project is more or less undocumented, but thanks to an army of Android users, there are plenty of blog posts and source code around to discover your own personal bugs, so you can keep on blog posting about it. Anyway… It worked. 
I still need to motivate myself to dive into the source code to comprehend how it works under the hood, but it worked.</p> <p>Today… When I need to talk to one of these services, I need to pick a lousy, juicy, smelly sh*t of my preference:</p> <ul> <li> <p>Inject the entire Scala milky way into your 1-file Java microservice, which could have been delivered as a 5MB fat-JAR before Scala space-warping it into 50MB. And don’t forget to pat your IDE on the back every time it needs to auto-complete a Scala class. Oh, by the way, have you ever tried accessing a <code>scala.Option</code> from Java? Boy! It is fun! I hope your service consumers think likewise.</p> </li> <li> <p>Let the giant Java driver bring all its feature-rich functionality together with its cousins, its nephews, its uncle, its grandma, its grandpa, its friends from school, its ex, and of course spring-core. All you wanted was to make a <code>GET</code> to <code>/v1/user/&lt;id&gt;</code>, but now you have the entire Pivotal art gallery decorating your <code>mvn dependency:tree</code> output on the wall.</p> </li> <li> <p>You can of course use maven-shade-plugin to shade and relocate the entire <code>javax.ws.rs-api</code> and Jersey dependencies, together with the entire universe. I know you can do that.</p> </li> <li> <p>Browse to Google’s <code>async-http-client</code> webpage and try to find the page that explains how to make a simple fscking <code>GET</code> request.</p> </li> <li> <p>Embrace the old Ning client wrapper, welcome bugs (the first time I needed to use it, I found out that <code>setHeaders()</code> wasn’t working as expected), stick to JAX-RS 1.x and an ancient Netty version, which causes JAR Hell with any recent library, e.g., Elasticsearch. (Please refer back to the maven-shade-plugin item.)</p> </li> </ul> <p>I can hear you shouting about compile-time-generated HTTP clients based on Swagger or WADL specs. But weren’t we just cursing WSDL and trying to run away from it? 
<a href="https://square.github.io/retrofit/">Retrofit</a>? <a href="https://twitter.github.io/finagle/">Finagle</a>? <a href="http://www.grpc.io/">gRPC</a>? I bet it is a matter of time until you end up needing to consume two clients which have transitive dependencies on two binary-incompatible versions of Retrofit/Finagle/gRPC. You can blame the Java class loader mechanism, but that doesn’t make the problem fade away. Oh! I was just about to forget! Wait until I migrate to <code>rx.Completable</code> from <code>rx.Single&lt;Void&gt;</code>, which I migrated from <code>rx.Observable&lt;Void&gt;</code>.</p> <p>I am exhausted and demotivated to write yet another single line of code that needs to communicate with a remote microservice and which could have been a simple fucking RPC. I don’t have a solution for the mud ball in my hands. Even if I did, I am not sure whether it would survive a couple of years. But in the back of my head, I keep on cursing the Java Platform SE guys: How difficult could it be to come up with a proper pluggable HTTP client? Compared to <code>NPE</code>, Java’s HTTP client is not <em>the</em> billion-dollar mistake, but a really close one.</p> tag:vlkan.com,2016-10-04://blog/post/2016/10/04/coders-at-work/ Notes on "Coders at Work" 2016-10-04T18:40:00Z 2016-10-04T18:40:00Z <p>There is nothing like thinking about work while you are on vacation. And that is indeed what I did: reading <a href="http://www.codersatwork.com/">Coders at Work: Reflections on the Craft of Programming</a> on a tango-themed Buenos Aires trip.</p> <p>I had already met Peter Seibel in his well-known, splendid work, <a href="http://www.gigamonkeys.com/book/">Practical Common Lisp</a>. He definitely has a knack for transforming technically challenging problems into a pedagogically digestible <a href="https://en.wikipedia.org/wiki/Nootropic">nootropic</a>. This uncanny ability is readily apparent in Coders at Work as well. 
Motivated by the thought-provoking, exquisite content, I felt an urge to keep a record of its reflections on me.</p> <h1 id="on-the-content">On the Content</h1> <p>I totally enjoyed the book and read it cover to cover. Nevertheless, I believe the following subtleties could have been addressed in a better way.</p> <ul> <li> <p>A majority of the interviewed <em>coders</em> are not actively <em>coding</em> any more. I find this a little bit at odds with the title of the book. While the content still conveys a great deal about the historical progress of programming and programmers, I find the detachment of the interviewees from modern computing slightly unexpected.</p> </li> <li> <p>Given the fact that the book examines events dating back more than half a century, I sometimes found myself lost in the time context. Additional footnotes to resolve these kinds of ambiguities could have been useful.</p> </li> </ul> <h1 id="highligts">Highlights</h1> <p>Below I collected my personal highlights on certain statements that are shared by certain interviewees.</p> <ul> <li> <p>In general, <em>coders</em> do not practice software testing extensively. Further, I had the impression that they do not read many programming books either. This was sometimes attributed to the lack of a necessary set of fundamental books at the early stages of their careers.</p> </li> <li> <p>Among the entire deck, I find Joshua Bloch, Bernie Cosell, and Donald Knuth the ones with the most sensible and down-to-earth statements.</p> </li> <li> <p>A notable subset of the interviewees were drawn into computers not by a deliberate decision, but by taking a career path that happened to be available to them. 
(For instance, Fran Allen took a Fortran instructor position at IBM to pay off her school loans so that she could continue pursuing her goal of becoming a math teacher.)</p> </li> <li> <p>None believes that reading Knuth’s <a href="https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming">The Art of Computer Programming</a> is a must-read for programmers; nevertheless, they acknowledge that it is good to keep it at hand for reference.</p> </li> <li> <p>Except for Knuth himself, nobody practices literate programming. (I am astounded by how Seibel insists on asking every single candidate this nonsense question, which delivers no practical value at all.)</p> </li> <li> <p>The majority agrees that programming is a far more complicated and challenging occupation than it once used to be.</p> </li> <li> <p>More than half think that good writing skills are a big plus (some even call them a necessity) for programming.</p> </li> <li> <p><code>printf</code> is the clear winner as the debugging tool of preference among the interviewees.</p> </li> </ul> <h1 id="quotes">Quotes</h1> <p>Below you can find some snippets that I find worth mentioning from the book.</p> <h2 id="jamie-zawinski">Jamie Zawinski</h2> <p>I wish I had known this when I was in high school. Could not agree more.</p> <blockquote> <p><strong>Zawinski:</strong> When you’re in high school, everyone tells you, “There’s a lot of repetitive bullshit and standardized tests; it’ll all be better once you’re in college.” And then you get to your first year of college and they’re like, “Oh, no – it gets better when you’re in grad school.” So it’s just same shit, different day – I couldn’t take it. [p5]</p> </blockquote> <p>His comments on C++, which are shared by many other interviewees throughout the book:</p> <blockquote> <p><strong>Zawinski:</strong> … when you’re programming C++ no one can ever agree on which ten percent of the language is safe to use. 
[p20]</p> </blockquote> <p>The sad truth about F/OSS:</p> <blockquote> <p><strong>Seibel:</strong> Isn’t it exactly this thing – someone comes along and says, “I can’t understand this stuff. I’ll just rewrite it” – that leads to the endless rewriting you bemoan in open-source development?</p> <p><strong>Zawinski:</strong> Yeah. But there’s also another aspect of that which is, efficiency aside, it’s just more fun to write your own code than to figure out someone else’s. So it is easy to understand why that happens. But the whole Linux/GNOME side of things is straddling this line between someone’s hobby and a product. Is this a research project where we’re deciding what desktops should look like and we’re experimenting? Or are we competing with Macintosh? Which is it? Hard to do both. [p23]</p> </blockquote> <h2 id="brad-fitzpatrick">Brad Fitzpatrick</h2> <p>His thoughts on finishing a project, which I sadly share as well:</p> <blockquote> <p><strong>Fitzpatrick:</strong> The projects that I never finish … it’s because I did the hard part and I learned what I wanted to learn and I never got around to doing the boring stuff. [p20]</p> </blockquote> <p>He is also poisoned by LWN, Reddit, etc.</p> <blockquote> <p><strong>Fitzpatrick:</strong> I like working alone but I just bounce all over the place when I do. On a plane I’ll bring extra laptop batteries and I have a whole development environment with local web servers and I’ll be in a web browser, testing stuff. But I’ll still be hitting new tabs, and typing “reddit” or “lwn” – sites I read. Autocomplete and hit Enter, and then – error message. I’ll do this multiple times within a minute. Holy fuck! Do I do this at work? Am I reading web site this often that I don’t even think about it? It’s scary. I had a friend, who had some iptables rule, that on connection to a certain IP address between certain hours of the day would redirect to a “You should be working” page. 
I haven’t got around to doing that, but I need to do something like it, probably. [p73]</p> </blockquote> <h2 id="douglas-crockford">Douglas Crockford</h2> <p>Why is programming difficult?</p> <blockquote> <p><strong>Crockford:</strong> Part of what makes programming difficult is most of the time we’re doing stuff we’ve never done before. [p110]</p> </blockquote> <p>He talks about his preferred way of interviewing job candidates, which is also shared by other coders in the book.</p> <blockquote> <p><strong>Crockford:</strong> The approach I’ve taken now is to do a code reading. I invite the candidate to bring in a piece of code he’s really proud of and walk us through it. [p129]</p> </blockquote> <h2 id="brendan-eich">Brendan Eich</h2> <p>Nothing noteworthy, you may guess why.</p> <h2 id="joshua-bloch">Joshua Bloch</h2> <p>Is Java off in the weeds?</p> <blockquote> <p><strong>Seibel:</strong> … is Java off in the weeds a little bit? Is it getting more complex faster than it’s getting better?</p> <p><strong>Bloch:</strong> That’s a very difficult question. In particular, the Java 5 changes added far more complexity than we ever intended. I had no understanding of just how much complexity generics and, in particular, wildcards were going to add to the language. I have to give credit where it is due – Graham Hamilton did understand this at the time and I didn’t.</p> <p>The funny thing is, he fought against it for years, trying to keep generics out of the language. But the notion of variance – the idea behind wildcards – came into fashion during the years when generics were successfully being kept out of Java. If they had gone in earlier, without variance, we might have had a simpler, more tractable language today.</p> <p>That said, there are real benefits to wildcards. There’s a fundamental impedance mismatch between subtyping and generics, and wildcards go a long way towards rectifying the mismatch. But at a significant cost in terms of complexity.
There are some people who believe that declaration-site, as opposed to use-site, variance is a better solution, but I’m not so sure.</p> <p>The jury is basically still out on anything that hasn’t been tested by a huge quantity of programmers under real-world conditions. Often languages only succeed in some niche and people say, “Oh, they’re great and it’s such a pity they didn’t become the successful language in the world.” But often there are reasons they didn’t. Hopefully some language that does use declaration-site variance, like Scala or C# 4.0, will answer this question once and for all. [p191]</p> </blockquote> <p>On “obviously no deficiencies” versus “no obvious deficiencies”:</p> <blockquote> <p><strong>Bloch:</strong> There’s a brilliant quote by Tony Hoare in his Turing Award speech about how there are two ways to design a system: “One way is to make it so simple that there are <em>obviously</em> no deficiencies and the other way is to make it so complicated that there are no <em>obvious</em> deficiencies.”</p> <p>The paragraph that follows is equally brilliant, though it isn’t as well-known: “The first method is far more difficult. It demands the same skill, devotion, insight, and even inspiration as the discovery of the simple physical laws which underlie the complex phenomena of nature. It also requires a willingness to accept objectives which are limited by physical, logical, and technological constraints, and to accept a compromise when conflicting objectives cannot be met. No committee will ever do this until it is too late.” [p197]</p> </blockquote> <p>Smart people and programming:</p> <blockquote> <p><strong>Seibel:</strong> Speaking of writing intricate code, I’ve noticed that people who are too smart, in a certain dimension anyway, make the worst code.
Because they can actually fit the whole thing in their head they can write these great reams of spaghetti code.</p> <p><strong>Bloch:</strong> I agree with you that people who are both smart enough to cope with enormous complexity and lack empathy with the rest of us may fall prey to that. They think, “I can understand this and I can use it, so it has to be good.” [p202]</p> <p>…</p> <p>There’s this problem, which is, programming is so much of an intellectual meritocracy and often these people are the smartest people in the organization; therefore they figure they should be allowed to make all the decisions. But merely the fact that they’re the smartest people in the organization doesn’t mean they should be making all the decisions, because intelligence is not a scalar quantity; it’s a vector quantity. And if you lack empathy or emotional intelligence, then you shouldn’t be designing APIs or GUIs or languages. [p203]</p> </blockquote> <h2 id="joe-armstrong">Joe Armstrong</h2> <p>On the paralysis of choice:</p> <blockquote> <p><strong>Armstrong:</strong> The funny thing is, thinking back, I don’t think all these modern gizmos actually make you any more productive. Hierarchical file systems – how do they make you more productive? Most of software development goes on in your head anyway. I think having worked with that simpler system imposes a kind of disciplined way of thinking. If you haven’t got a directory system and you have to put all the files in one directory, you have to be fairly disciplined. If you haven’t got a revision control system, you have to be fairly disciplined. Given that you apply that discipline to what you’re doing it doesn’t seem to me to be any better to have hierarchical file systems and revision control. They don’t solve the fundamental problem of solving your problem. They probably make it easier for groups of people to work together. For individuals I don’t see any difference.</p> <p>Also, I think today we’re kind of overburdened by choice.
I mean, I just had Fortran. I don’t think we even had shell scripts. We just had batch files so you could run things, a compiler, and Fortran. And assembler possibly, if you really needed it. So there wasn’t this agony of choice. Being a young programmer today must be awful – you can choose 20 different programming languages, dozens of frameworks and operating systems, and you’re paralyzed by choice. There was no paralysis of choice then. You just start doing it because the decision as to which language and things is just made – there’s no thinking about what you should do, you just go and do it. [p210]</p> </blockquote> <h2 id="simon-peyton-jones">Simon Peyton Jones</h2> <p>Testing an API at Microsoft:</p> <blockquote> <p><strong>Peyton Jones:</strong> Well, they also do some interesting work on testing APIs. Steven Clarke and his colleagues at Redmond have made systematic attempts to watch programmers, given a new API, talk through what they’re trying to do. And they get the people who designed the API to sit behind a glass screen and watch them.</p> <p>And the guys sitting there behind the glass screen say, “No, no, don’t do that! That’s not the right way!” But it’s soundproof. That turns out often to be very instructive. They go and change their API. [p253]</p> </blockquote> <h2 id="peter-norvig">Peter Norvig</h2> <p>On the traditional master and apprentice approach:</p> <blockquote> <p><strong>Norvig:</strong> But I think part of the reasons why you had master and apprentice is because the materials were rarer. When you were doing goldsmithing, there’s only so much gold. Or when the surgeon’s operating, there’s only one heart, and so you want the best person on that and you want the other guys just helping. With coding, it’s not like that. You’ve got plenty of terminals. You’ve got plenty of keyboards. You don’t have to ration it.
[p295]</p> </blockquote> <p>Why programming is not an art, but a craft:</p> <blockquote> <p><strong>Seibel:</strong> As a programmer, do you consider yourself a scientist, an engineer, an artist, or a craftsman?</p> <p><strong>Norvig:</strong> Well, I know when you compare the various titles of books and so on, I always thought the “craft” was the right answer. So I thought art was a little pretentious because the purpose of art is to be beautiful or to have an emotional contact or emotional impact, and I don’t feel like that’s anything that I try to do. Certainly I want programs to be pretty in some ways, and sometimes I feel like I spend too much time doing that. I’ve been in a position where I’ve had the luxury to say, “Gee, I have time to go back and pretty this up a little bit.” And places where I’ve been able to write for a publication, you spend more time doing that than you would if it was just for your own professional growth.</p> <p>But I don’t think of that as art. I think <em>craft</em> is really the right word for it. You can make a chair, and it’s good looking, but it’s mostly functional – it’s a chair. [p319]</p> </blockquote> <h2 id="guy-steele">Guy Steele</h2> <p>On the difficulty of getting a program right:</p> <blockquote> <p><strong>Steele:</strong> I’ll give you another example – suppose I were to tell my smart computer, “OK, I’ve got this address book and I want the addresses to always be in sorted order,” and it responds by throwing away everything but the first entry. Now the address book is sorted. But that’s not what you wanted. It turns out that just specifying something as simple as “a list is in sorted order and I haven’t lost any of the data and nothing has been duplicated” is actually a fairly tricky specification to write. [p361]</p> </blockquote> <h2 id="dan-ingalls">Dan Ingalls</h2> <p>Was a nice read, though I could not find anything particularly interesting worth sharing. 
Nevertheless, along the way Seibel mentions something that I had never heard of:</p> <blockquote> <p><strong>Seibel:</strong> Alan Kay has said that both Lisp and Smalltalk have the problem that they’re so good they eat their children. If you had known Lisp, then Smalltalk would have been the first eaten child. [p378]</p> </blockquote> <h2 id="l-peter-deutsch">L Peter Deutsch</h2> <p>On getting data structures right:</p> <blockquote> <p><strong>Deutsch:</strong> … if you get the data structures and their invariants right, most of the code will just kind of write itself. [p420]</p> </blockquote> <p>Conceptualization of software and memory pointers:</p> <blockquote> <p><strong>Deutsch:</strong> … I don’t look around and see anything that looks like an address or a pointer. We have objects; we don’t have these weird things that computer scientists misname “objects.”</p> <p><strong>Seibel:</strong> To say nothing of the scale. Two to the 64th of anything is a lot, and things happening billions of times a second is fast.</p> <p><strong>Deutsch:</strong> But that doesn’t bother us here in the real world. You know Avogadro’s number, right? Ten to the 23rd? So, we’re looking here around at a world that has incredible numbers of little things all clumped together and happening at the same time. It doesn’t bother us because the world is such that you don’t have to understand this table at a subatomic level. The physical properties of matter are such that 99.9 percent of the time you can understand it in aggregate. And everything you have to know about it, you can understand from dealing with it in aggregate. To a great extent, that is not true in the world of software.</p> <p>People keep trying to do modularization structures for software.
And the state of that art has been improving over time, but it’s still, in my opinion, very far away from the ease with which we look around and see things that have, whatever it is, 10 to the 23rd atoms in them, and it doesn’t even faze us.</p> <p>Software is a discipline of detail, and that is a deep, horrendous fundamental problem with software. Until we understand how to conceptualize and organize software in a way that we don’t have to think about how every little piece interacts with every other piece, things are not going to get a whole lot better. And we’re very far from being there. [p424]</p> </blockquote> <h2 id="ken-thompson">Ken Thompson</h2> <p>On teaching:</p> <blockquote> <p><strong>Thompson:</strong> … I love the teaching: the hard work of a first class, the fun of the second class. Then the misery of the third. [p455]</p> </blockquote> <p>What I am supposed to do and what I am actually doing:</p> <blockquote> <p><strong>Thompson:</strong> We were supposed to be doing basic research but there was some basic research we should be doing and some basic research we shouldn’t be doing. And just coming out of the ashes of MULTICS, operating systems was one of those basic research things we shouldn’t be doing. Because we tried it, it didn’t work, it was a huge failure, it was expensive; let’s drop it. So I kind of expected that for what I was doing I was going to eventually get fired. I didn’t. [p458]</p> </blockquote> <p>Code rots:</p> <blockquote> <p><strong>Thompson:</strong> Code by itself almost rots and it’s gotta be rewritten. Even when nothing has changed, for some reason it rots. [p460]</p> </blockquote> <p>10 percent of the work:</p> <blockquote> <p><strong>Thompson:</strong> NELIAC was a system-programming version of Algol 58.</p> <p><strong>Seibel:</strong> Was Bliss also from that era?</p> <p><strong>Thompson:</strong> Bliss I think was after. And their emphasis was trying to compile well.
I think it was pretty clear from the beginning that you shouldn’t kill yourself compiling well. You should do well but not really good. And the reason is that in the time it takes you to go from well to really good, Moore’s law has already surpassed you. You can pick up 10 percent but while you’re picking up that 10 percent, computers have gotten twice as fast and maybe with some other stuff that matters more for optimization, like caches. I think it’s largely a waste of time to do really well. It’s really hard; you generate as many bugs as you fix. You should stop, not take that extra 100 percent of time to do 10 percent of the work. [p462]</p> </blockquote> <p>Writing an OS to test a file system:</p> <blockquote> <p><strong>Seibel:</strong> So you basically wrote an OS so you’d have a better environment to test your file system.</p> <p><strong>Thompson:</strong> Yes. Halfway through there I realized it was a real time-sharing system. I was writing the shell to drive the file system. And then I was writing a couple other programs that drove the file system. And right about there I said, “All I need is an editor and I’ve got an operating system.” [p465]</p> </blockquote> <p>Economics of deciding on introducing a bug:</p> <blockquote> <p><strong>Thompson:</strong> Certainly every time I’ve written one of these non-compare subroutine calls, strcpy and stuff like that, I know that I’m writing a bug. And I somehow take the economic decision of whether the bug is worth the extra arguments.
[p468]</p> </blockquote> <p>On testing:</p> <blockquote> <p><strong>Thompson:</strong> … Mostly just regression tests.</p> <p><strong>Seibel:</strong> By things that are harder to test, you mean things like device drivers or networking protocols?</p> <p><strong>Thompson:</strong> Well, they’re run all the time when you’re actually running an operating system.</p> <p><strong>Seibel:</strong> So you figure you’ll shake the bugs out that way?</p> <p><strong>Thompson:</strong> Oh, absolutely. I mean, what’s better as a test of an operating system than people beating on it? [p469]</p> </blockquote> <p>Code at Google:</p> <blockquote> <p><strong>Thompson:</strong> I guess way more than 50 percent of the code is the what-if kind. [p473]</p> </blockquote> <p>On literate programming:</p> <blockquote> <p><strong>Seibel:</strong> When I interviewed him, Knuth said the key to technical writing is to say everything twice in complementary ways. So I think he sees that as a feature of literate programming, not a bug.</p> <p><strong>Thompson:</strong> Well if you have two ways, one of them is real: what the machine executes. [p477]</p> </blockquote> <h2 id="fran-allen">Fran Allen</h2> <p>What makes a program beautiful?</p> <blockquote> <p><strong>Allen:</strong> That it is a simple straightforward solution to a problem; that has some intrinsic structure and obviousness about it that isn’t obvious from the problem itself. [p489]</p> </blockquote> <h2 id="bernie-cosell">Bernie Cosell</h2> <p>Should we teach Knuth to students?</p> <blockquote> <p><strong>Cosell:</strong> I would not teach students Knuth per se for two reasons. First, it’s got all this mathematical stuff where he’s not just trying to present the algorithms but to derive whether they’re good or bad. I’m not sure you need that. I understand a little bit of it and I’m not sure I need any of it. 
But getting a feel for what’s fast and what’s slow and when, that’s an important thing to do even if you don’t know how much faster or how much slower.</p> <p>The second problem is once students get sensitive to that, they get too clever by half. They start optimizing little parts of the program because, “This is the ideal place to do an AB unbalanced 2-3 double reverse backward pointer cube thing and I always wanted to write one of those.” So they spend a week or two tuning an obscure part of a program that doesn’t need anything, which is now more complicated and didn’t make the program any better. So they need a tempered understanding that there are all these algorithms, how they work, and how to apply them. It’s really more of a case of how to pick the right one for the job you’re trying to do as opposed to knowing that this one is an order n-cubed plus three and this one is just order n-squared times four. [p527]</p> </blockquote> <p>Writing programs and learning how to program:</p> <blockquote> <p><strong>Cosell:</strong> The binary bits are what computers want and the text file is for me. I would get people – bright, really good people, right out of college, tops of their classes – on one of my projects. And they would know all about programming and I would give them some piece of the project to work on. And we would start crossing swords at our project-review meetings. They would say, “Why are you complaining about the fact that I have my global variables here, that I’m not doing this, that you don’t like the way the subroutines are laid out? The program works.”</p> <p>They’d be stunned when I tell them, “I don’t care that the program works. The fact that you’re working here at all means that I expect you to be able to write programs that work. Writing programs that work is a skilled craft and you’re good at it. 
Now, you have to learn how to program.” [p543]</p> </blockquote> <p>Convictions:</p> <blockquote> <p><strong>Cosell:</strong> I had two convictions, which actually served me well: that programs ought to make sense and there are very, very few inherently hard problems. [p549]</p> </blockquote> <p>How long is it going to take you to put this change in?</p> <blockquote> <p><strong>Cosell:</strong> So when they ask, “How long is it going to take you to put this change in?” you have three answers. The first is the absolute shortest way, changing the one line of code. The second answer is how long it would be using my simple rule of rewriting the subroutine as if you were not going to make that mistake. Then the third answer is how long if you fix that bug if you were actually writing this subroutine in the better version of the program. [p550]</p> </blockquote> <p>Artistry in programming:</p> <blockquote> <p><strong>Cosell:</strong> Part of what I call the artistry of the computer program is how easy it is for future people to be able to change it without breaking it. [p555]</p> </blockquote> <p>Difficulty of programming and C:</p> <blockquote> <p><strong>Cosell:</strong> … programmers just can’t be careful enough. They don’t see all the places. And C makes too many places. Too scary for me, and I guess it’s fair to say I’ve programmed C only about five years less than Ken has. We’re not in the same league, but I have a long track record with C and know how difficult it is and I think C is a big part of the problem. [p559]</p> </blockquote> <p>75 million run-of-the-mill programmers and Java:</p> <blockquote> <p><strong>Cosell:</strong> When I first messed with Java – this was when it was a little baby language, of course – I said, “Oh, this is just another one of those languages to help not-so-good programmers go down the straight and narrow by restricting what they can do.” But maybe we’ve come to a point where that’s the right thing.
Maybe the world has gotten so dangerous you can’t have a good, flexible language that one percent or two percent of the programmers will use to make great art because the world is now populated with 75 million run-of-the-mill programmers building these incredibly complicated applications and they need more help than that. So maybe Java’s the right thing. I don’t know. [p560]</p> </blockquote> <p>Not-so-good programmers and C:</p> <blockquote> <p><strong>Cosell:</strong> I don’t want to say that C has outlived its usefulness, but I think it was used by too many good programmers so that now not-good-enough programmers are using it to build applications and the bottom line is they’re not good enough and they can’t. Maybe C is the perfect language for really good systems programmers, but unfortunately not-so-good systems and applications programmers are using it and they shouldn’t be. [p560]</p> </blockquote> <h2 id="donald-knuth">Donald Knuth</h2> <p>Teaching a class, writing a book, and programming:</p> <blockquote> <p><strong>Knuth:</strong> I could teach classes full-time and write a book full-time but software required so much attention to detail. It filled that much of my brain to the exclusion of other stuff. So it gave me a special admiration for people who do large software projects – I would never have guessed it without having been faced with that myself. [p572]</p> </blockquote> <p>Why isn’t everybody a super programmer and super writer?</p> <blockquote> <p><strong>Knuth:</strong> Now, why hasn’t this spread over the whole world and why isn’t everybody doing it? I’m not sure who it was who hit the nail on the head – I think it was Jon Bentley. Simplified it is like this: only two percent of the world’s population is born to be super programmers. And only two percent of the population is born to be super writers. And Knuth is expecting everybody to be both. 
[p574]</p> </blockquote> <p>Use of pointers in C:</p> <blockquote> <p><strong>Knuth:</strong> To me one of the most important revolutions in programming languages was the use of pointers in the C language. When you have nontrivial data structures, you often need one part of the structure to point to another part, and people played around with different ways to put that into a higher-level language. Tony Hoare, for example, had a pretty nice clean system but the thing that the C language added – which at first I thought was a big mistake and then it turned out I loved it – was that when x is a pointer and then you say, x + 1, that doesn’t mean one more byte after x but it means one more node after x, depending on what x points to: if it points to a big node, x + 1 jumps by a large amount; if x points to a small thing, x + 1 just moves a little. That, to me, is one of the most amazing improvements in notation. [p585]</p> </blockquote> <p>I did not know about Knuth’s <em>change files</em>. But it seemed like inconvenient overkill:</p> <blockquote> <p><strong>Knuth:</strong> I had written TeX and Metafont and people started asking for it. And they had 200 or 300 combinations of programming language and operating system and computer, so I wanted to make it easy to adapt my code to anybody’s system. So we came up with the solution that I would write a master program that worked at Stanford and then there was this add-on called a change file which could customize it to anybody else’s machine.</p> <p>A change file is a very simple thing. It consists of a bunch of little blobs of changes. Each change starts out with a few lines of code. You match until you find the first line in the master file that agrees with the first line of your change.
When you get to the end of the part of the change that was supposed to match the master file, then comes the part which says, “Replace that by these lines instead.” [p586]</p> <p>The extreme example of this was when TeX was adapted to Unicode. They had a change file maybe 10 times as long as the master program. In other words, they changed from an 8-bit program to a 16-bit program but instead of going through and redoing my master program, they were so into change files that they just wrote their whole draft of what they called Omega as change files, as a million lines of change files to TeX’s 20,000 lines of code or something. So that’s the extreme. [p587]</p> </blockquote> <p>Is programming fun any more?</p> <blockquote> <p><strong>Knuth:</strong> So there’s that change and then there’s the change that I’m really worried about: that the way a lot of programming goes today isn’t any fun because it’s just plugging in magic incantations – combine somebody else’s software and start it up. It doesn’t have much creativity. I’m worried that it’s becoming too boring because you don’t have a chance to do anything much new. [p594]</p> </blockquote> <p>Code reading:</p> <blockquote> <p><strong>Knuth:</strong> … don’t only read the people who code like you. [p601]</p> </blockquote> tag:vlkan.com,2016-08-12://blog/post/2016/08/12/hotspot-heapdump-threadump/ Programmatically Taking Heap and Thread Dumps in HotSpot 2016-08-12T17:53:00Z 2016-08-12T17:53:00Z <p>While taking heap and thread dumps is one click away with the modern JVM toolset, in many cases deployment-environment access restrictions render these options unusable. Hence, you might end up exposing these functionalities through channels such as an internal REST interface. This implies a new, nasty obstacle: you need to know how to programmatically take heap and thread dumps in a Java application. Unfortunately, to date there is no standard interface to access these functionalities from within the VM.
But if you are only concerned about HotSpot, then you are in luck!</p> <h1 id="heap-dumps">Heap Dumps</h1> <p>For heap dumps, once you get your teeth into a <a href="https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/HotSpotDiagnosticMXBean.html">HotSpotDiagnosticMXBean</a>, you are good to go. It already exposes a <a href="https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/HotSpotDiagnosticMXBean.html#dumpHeap-java.lang.String-boolean-">dumpHeap()</a> method ready to be used.</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.sun.management.HotSpotDiagnosticMXBean</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">javax.management.MBeanServer</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.io.File</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.io.IOException</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.lang.management.ManagementFactory</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">enum</span> <span class="n">HotSpotHeapDumps</span> <span class="o">{;</span> <span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">String</span> <span class="n">HOT_SPOT_DIAGNOSTIC_MX_BEAN_NAME</span> <span class="o">=</span> <span class="s">"com.sun.management:type=HotSpotDiagnostic"</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">HotSpotDiagnosticMXBean</span> <span class="n">HOT_SPOT_DIAGNOSTIC_MX_BEAN</span> <span class="o">=</span> <span class="n">getHotspotDiagnosticMxBean</span><span class="o">();</span> <span class="kd">private</span> <span class="kd">static</span> <span class="n">HotSpotDiagnosticMXBean</span> <span class="nf">getHotspotDiagnosticMxBean</span><span class="o">()</span> <span class="o">{</span> <span class="n">MBeanServer</span> <span class="n">server</span> <span class="o">=</span> <span class="n">ManagementFactory</span><span class="o">.</span><span class="na">getPlatformMBeanServer</span><span class="o">();</span> <span
class="k">try</span> <span class="o">{</span> <span class="k">return</span> <span class="n">ManagementFactory</span><span class="o">.</span><span class="na">newPlatformMXBeanProxy</span><span class="o">(</span> <span class="n">server</span><span class="o">,</span> <span class="n">HOT_SPOT_DIAGNOSTIC_MX_BEAN_NAME</span><span class="o">,</span> <span class="n">HotSpotDiagnosticMXBean</span><span class="o">.</span><span class="na">class</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">IOException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">"failed getting Hotspot Diagnostic MX bean"</span><span class="o">,</span> <span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">create</span><span class="o">(</span><span class="n">File</span> <span class="n">file</span><span class="o">,</span> <span class="kt">boolean</span> <span class="n">live</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">IOException</span> <span class="o">{</span> <span class="n">HOT_SPOT_DIAGNOSTIC_MX_BEAN</span><span class="o">.</span><span class="na">dumpHeap</span><span class="o">(</span><span class="n">file</span><span class="o">.</span><span class="na">getAbsolutePath</span><span class="o">(),</span> <span class="n">live</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>When the second argument of <code>dumpHeap</code> is <code>true</code>, only live objects, that is, objects that are reachable from others, get dumped.</p> <p>Note that many real-world Java applications occupy quite some memory.
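If you just want to smoke-test this mechanism from a throwaway <code>main()</code>, the same MX bean can also be obtained via <code>ManagementFactory#getPlatformMXBean()</code> (Java 7+), which spares you from spelling out the object name. A minimal sketch; the class and file names here are mine, not part of any real service:

```java
import com.sun.management.HotSpotDiagnosticMXBean;

import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDumpDemo {

    public static void main(String[] args) throws Exception {
        // On HotSpot, the diagnostic bean is a platform MX bean and can be
        // looked up directly by its interface.
        HotSpotDiagnosticMXBean mxBean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap() refuses to overwrite an existing file, hence the delete.
        File dumpFile = new File("demo-heap.hprof");
        dumpFile.delete();
        // true => dump only live (reachable) objects.
        mxBean.dumpHeap(dumpFile.getAbsolutePath(), true);
        System.out.println("heap dump size: " + dumpFile.length() + " bytes");
    }
}
```

Recent JDKs additionally require the target file name to end with <code>.hprof</code>, hence the extension above.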
As a result, the created heap dumps generally end up consuming a significant amount of disk space. You need to come up with your own custom cleanup mechanism to tackle this problem. (For instance, in a JAX-RS resource, you can employ a custom <code>MessageBodyWriter</code> that deletes the file after writing its entire content to the output stream.)</p> <h1 id="thread-dumps">Thread Dumps</h1> <p>At first glance, thread dumps just contain simple plain-text data.</p> <pre><code>2016-08-12 18:40:46
Full thread dump OpenJDK 64-Bit Server VM (25.76-b198 mixed mode):

"RMI TCP Connection(266)-127.0.0.1" #24884 daemon prio=9 os_prio=0 tid=0x00007f9474010000 nid=0x2cee runnable [0x00007f941571b000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:170)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
	- locked &lt;0x00000005c086e8b0&gt; (a java.io.BufferedInputStream)
	at java.io.FilterInputStream.read(FilterInputStream.java:83)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:550)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$83/628845041.run(Unknown Source)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	-
&lt;0x00000005c0489198&gt; (a java.util.concurrent.ThreadPoolExecutor$Worker) "JobScheduler FJ pool 0/4" #24883 daemon prio=6 os_prio=0 tid=0x00007f946415d800 nid=0x2ced waiting on condition [0x00007f94093d2000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for &lt;0x00000005d8a5f9e0&gt; (a jsr166e.ForkJoinPool) at jsr166e.ForkJoinPool.awaitWork(ForkJoinPool.java:1756) at jsr166e.ForkJoinPool.scan(ForkJoinPool.java:1694) at jsr166e.ForkJoinPool.runWorker(ForkJoinPool.java:1642) at jsr166e.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:108) Locked ownable synchronizers: - None </code></pre> <p>Unfortunately, thread dumps do not have a standard syntax. There are various ways to produce this output, and thread dump analysis software does not play well with all of them. For instance, <a href="https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=2245aa39-fa5c-4475-b891-14c205f7333c">IBM Thread and Monitor Dump Analyzer for Java</a> cannot parse thread dumps created by VisualVM using JMX. At the end of the day, I always needed to fall back to a HotSpot thread dump.</p> <p><code>tools.jar</code> shipped with JDKs (&gt;=1.6) provides the magical <a href="http://www.docjar.com/docs/api/sun/tools/attach/HotSpotVirtualMachine.html">HotSpotVirtualMachine</a> class containing our saviour <code>remoteDataDump()</code> method.
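</p> <p>If a crude dump is sufficient, the standard <code>java.lang.management.ThreadMXBean</code> API can also produce one without <code>tools.jar</code>. Here is a minimal sketch (the class name is mine); be warned that <code>ThreadInfo#toString()</code> caps each stack trace at 8 frames, and, as discussed above, analysis software may refuse such output:</p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public final class JmxThreadDumps {

    private JmxThreadDumps() {}

    /** Builds a crude textual dump of all live threads via the platform ThreadMXBean. */
    public static String create() {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        ThreadInfo[] threadInfos =
                threadMXBean.dumpAllThreads(
                        threadMXBean.isObjectMonitorUsageSupported(),
                        threadMXBean.isSynchronizerUsageSupported());
        StringBuilder builder = new StringBuilder();
        for (ThreadInfo threadInfo : threadInfos) {
            // ThreadInfo#toString() emits a stack-trace-like entry, truncated to 8 frames.
            builder.append(threadInfo);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(create());
    }
}
```

<p>For a dump that analysis software actually accepts, though, we still need <code>tools.jar</code>.</p> <p>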
First add the following lines to your <code>pom.xml</code>:</p> <pre><code class="language-xml"><span class="nt">&lt;dependencyManagement&gt;</span> <span class="nt">&lt;dependencies&gt;</span> <span class="nt">&lt;dependency&gt;</span> <span class="nt">&lt;groupId&gt;</span>com.sun<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>tools<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${java.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;scope&gt;</span>system<span class="nt">&lt;/scope&gt;</span> <span class="nt">&lt;systemPath&gt;</span>${tools.jar}<span class="nt">&lt;/systemPath&gt;</span> <span class="nt">&lt;/dependency&gt;</span> <span class="nt">&lt;/dependencies&gt;</span> <span class="nt">&lt;/dependencyManagement&gt;</span> <span class="nt">&lt;profiles&gt;</span> <span class="c">&lt;!-- tools.jar path for GNU/Linux and Windows --&gt;</span> <span class="nt">&lt;profile&gt;</span> <span class="nt">&lt;id&gt;</span>default-tools.jar<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;activation&gt;</span> <span class="nt">&lt;file&gt;</span> <span class="nt">&lt;exists&gt;</span>${java.home}/../lib/tools.jar<span class="nt">&lt;/exists&gt;</span> <span class="nt">&lt;/file&gt;</span> <span class="nt">&lt;/activation&gt;</span> <span class="nt">&lt;properties&gt;</span> <span class="nt">&lt;tools.jar&gt;</span>${java.home}/../lib/tools.jar<span class="nt">&lt;/tools.jar&gt;</span> <span class="nt">&lt;/properties&gt;</span> <span class="nt">&lt;/profile&gt;</span> <span class="c">&lt;!-- tools.jar path for OSX --&gt;</span> <span class="nt">&lt;profile&gt;</span> <span class="nt">&lt;id&gt;</span>default-tools.jar-mac<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;activation&gt;</span> <span class="nt">&lt;file&gt;</span> <span class="nt">&lt;exists&gt;</span>${java.home}/../Classes/classes.jar<span class="nt">&lt;/exists&gt;</span> <span 
class="nt">&lt;/file&gt;</span> <span class="nt">&lt;/activation&gt;</span> <span class="nt">&lt;properties&gt;</span> <span class="nt">&lt;tools.jar&gt;</span>${java.home}/../Classes/classes.jar<span class="nt">&lt;/tools.jar&gt;</span> <span class="nt">&lt;/properties&gt;</span> <span class="nt">&lt;/profile&gt;</span> <span class="nt">&lt;/profiles&gt;</span></code></pre> <p>Then the rest is a matter of accessing to <code>HotSpotVirtualMachine</code> class:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.google.common.io.ByteStreams</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.sun.management.HotSpotDiagnosticMXBean</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.sun.tools.attach.AttachNotSupportedException</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">com.sun.tools.attach.VirtualMachine</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">sun.tools.attach.HotSpotVirtualMachine</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.io.IOException</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.io.InputStream</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.lang.management.ManagementFactory</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">enum</span> <span class="n">HotSpotThreadDumps</span> <span class="o">{;</span> <span class="kd">public</span> <span class="n">String</span> <span class="nf">create</span><span class="o">()</span> <span class="kd">throws</span> <span class="n">AttachNotSupportedException</span><span class="o">,</span> <span class="n">IOException</span> <span class="o">{</span> <span class="c1">// Get the PID of the current JVM process.</span> <span class="n">String</span> <span class="n">selfName</span> <span class="o">=</span> <span 
class="n">ManagementFactory</span><span class="o">.</span><span class="na">getRuntimeMXBean</span><span class="o">().</span><span class="na">getName</span><span class="o">();</span> <span class="n">String</span> <span class="n">selfPid</span> <span class="o">=</span> <span class="n">selfName</span><span class="o">.</span><span class="na">substring</span><span class="o">(</span><span class="mi">0</span><span class="o">,</span> <span class="n">selfName</span><span class="o">.</span><span class="na">indexOf</span><span class="o">(</span><span class="sc">'@'</span><span class="o">));</span> <span class="c1">// Attach to the VM.</span> <span class="n">VirtualMachine</span> <span class="n">vm</span> <span class="o">=</span> <span class="n">VirtualMachine</span><span class="o">.</span><span class="na">attach</span><span class="o">(</span><span class="n">selfPid</span><span class="o">);</span> <span class="n">HotSpotVirtualMachine</span> <span class="n">hotSpotVm</span> <span class="o">=</span> <span class="o">(</span><span class="n">HotSpotVirtualMachine</span><span class="o">)</span> <span class="n">vm</span><span class="o">;</span> <span class="c1">// Request a thread dump.</span> <span class="k">try</span> <span class="o">(</span><span class="n">InputStream</span> <span class="n">inputStream</span> <span class="o">=</span> <span class="n">hotSpotVm</span><span class="o">.</span><span class="na">remoteDataDump</span><span class="o">())</span> <span class="o">{</span> <span class="kt">byte</span><span class="o">[]</span> <span class="n">bytes</span> <span class="o">=</span> <span class="n">ByteStreams</span><span class="o">.</span><span class="na">toByteArray</span><span class="o">(</span><span class="n">inputStream</span><span class="o">);</span> <span class="k">return</span> <span class="k">new</span> <span class="nf">String</span><span class="o">(</span><span class="n">bytes</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span> <span 
class="o">}</span></code></pre> <p>You finished writing this code, you clicked on the Run button of the IDE, and it worked like a charm. This got you so excited that you wanted to add this functionality to your JEE service! Or better: Turn this into a JAR and pass it to your client’s machine and watch them take their part in the joy of thread-dump-oriented debugging! And this is what you get in return:</p> <pre><code>java.lang.NoClassDefFoundError: com/sun/tools/attach/AttachNotSupportedException </code></pre> <p>This indicates that you did not pay attention to my words: <em><code>tools.jar</code> is shipped with JDKs.</em> So neither your flashy JEE application server, nor your client’s machine has a JDK, but a JRE. Rings a bell? Yes, you can indeed add <code>tools.jar</code> into the final WAR/JAR of your project:</p> <pre><code class="language-xml"><span class="nt">&lt;build&gt;</span> <span class="nt">&lt;plugins&gt;</span> <span class="c">&lt;!-- copy tools.jar from JAVA_HOME --&gt;</span> <span class="nt">&lt;plugin&gt;</span> <span class="nt">&lt;groupId&gt;</span>org.apache.maven.plugins<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>maven-dependency-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;executions&gt;</span> <span class="nt">&lt;execution&gt;</span> <span class="nt">&lt;id&gt;</span>copy-system-dependencies<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;phase&gt;</span>prepare-package<span class="nt">&lt;/phase&gt;</span> <span class="nt">&lt;goals&gt;</span> <span class="nt">&lt;goal&gt;</span>copy-dependencies<span class="nt">&lt;/goal&gt;</span> <span class="nt">&lt;/goals&gt;</span> <span class="nt">&lt;configuration&gt;</span> <span class="nt">&lt;outputDirectory&gt;</span>${project.build.directory}/${project.build.finalName}/WEB-INF/lib<span class="nt">&lt;/outputDirectory&gt;</span> <span class="nt">&lt;includeScope&gt;</span>system<span
class="nt">&lt;/includeScope&gt;</span> <span class="nt">&lt;/configuration&gt;</span> <span class="nt">&lt;/execution&gt;</span> <span class="nt">&lt;/executions&gt;</span> <span class="nt">&lt;/plugin&gt;</span> <span class="nt">&lt;/plugins&gt;</span> <span class="nt">&lt;/build&gt;</span></code></pre> <p>Note that this approach incorporates a JDK-specific JAR into your application and assumes that the application will run on a HotSpot VM. But unfortunately this is the only way that I know of to produce a thread dump that works with thread dump analysis software. If you don’t have such a need and just want a crude JMX generated thread dump, check out <a href="https://java.net/projects/visualvm/sources/svn/content/branches/release134/visualvm/jmx/src/com/sun/tools/visualvm/jmx/impl/JmxSupport.java">JmxSupport.java</a> shipped with VisualVM.</p> tag:vlkan.com,2016-07-20://blog/post/2016/07/20/rxjava-backpressure/ Callback Blocking for Back-Pressure in RxJava 2016-07-20T20:32:00Z 2016-07-20T20:32:00Z <p>In a reactive application, you don’t necessarily have control over the production and/or consumption rate of certain streams. 
This speed mismatch can cause severe, hard-to-find bugs that might go unnoticed in development environments while bringing the entire system down in production.</p> <h1 id="life-without-back-pressure">Life Without Back-Pressure</h1> <p>Consider the following example:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.google.common.base.Throwables</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">rx.Observable</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.atomic.AtomicInteger</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.function.Supplier</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">enum</span> <span class="n">NoBackPressure</span> <span class="o">{;</span> <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span> <span class="kt">long</span> <span class="n">producePeriod</span> <span class="o">=</span> <span class="mi">100</span><span class="o">;</span> <span class="kt">long</span> <span class="n">consumePeriod</span> <span class="o">=</span> <span class="mi">300</span><span class="o">;</span> <span class="n">AtomicInteger</span> <span class="n">pendingTaskCount</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">AtomicInteger</span><span class="o">();</span> <span class="c1">// Create a fast producer emitting an infinite number of items.</span> <span class="n">createStream</span><span class="o">(</span><span class="n">producePeriod</span><span class="o">,</span> <span class="kc">true</span><span class="o">,</span> <span class="nl">pendingTaskCount:</span><span class="o">:</span><span class="n">incrementAndGet</span><span class="o">)</span> <span class="o">.</span><span class="na">flatMap</span><span class="o">(</span><span
class="n">ignored</span> <span class="o">-&gt;</span> <span class="c1">// Create a slow consumer emitting just one item.</span> <span class="n">createStream</span><span class="o">(</span><span class="n">consumePeriod</span><span class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span class="nl">pendingTaskCount:</span><span class="o">:</span><span class="n">decrementAndGet</span><span class="o">))</span> <span class="o">.</span><span class="na">take</span><span class="o">(</span><span class="mi">5</span><span class="o">)</span> <span class="o">.</span><span class="na">toBlocking</span><span class="o">()</span> <span class="o">.</span><span class="na">last</span><span class="o">();</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"pending task count: %d\n"</span><span class="o">,</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">get</span><span class="o">());</span> <span class="o">}</span> <span class="kd">private</span> <span class="kd">static</span> <span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">Observable</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="nf">createStream</span><span class="o">(</span><span class="kt">long</span> <span class="n">pausePeriodMillis</span><span class="o">,</span> <span class="kt">boolean</span> <span class="n">infinite</span><span class="o">,</span> <span class="n">Supplier</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">body</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Observable</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">subscriber</span> <span class="o">-&gt;</span> <span 
class="o">{</span> <span class="k">new</span> <span class="nf">Thread</span><span class="o">()</span> <span class="o">{</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span> <span class="k">do</span> <span class="o">{</span> <span class="n">pause</span><span class="o">(</span><span class="n">pausePeriodMillis</span><span class="o">);</span> <span class="n">T</span> <span class="n">next</span> <span class="o">=</span> <span class="n">body</span><span class="o">.</span><span class="na">get</span><span class="o">();</span> <span class="n">subscriber</span><span class="o">.</span><span class="na">onNext</span><span class="o">(</span><span class="n">next</span><span class="o">);</span> <span class="o">}</span> <span class="k">while</span> <span class="o">(</span><span class="n">infinite</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">subscriber</span><span class="o">.</span><span class="na">isUnsubscribed</span><span class="o">());</span> <span class="o">}</span> <span class="o">}.</span><span class="na">start</span><span class="o">();</span> <span class="o">});</span> <span class="o">}</span> <span class="kd">private</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">pause</span><span class="o">(</span><span class="kt">long</span> <span class="n">millis</span><span class="o">)</span> <span class="o">{</span> <span class="k">try</span> <span class="o">{</span> <span class="n">Thread</span><span class="o">.</span><span class="na">sleep</span><span class="o">(</span><span class="n">millis</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">InterruptedException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="n">Throwables</span><span class="o">.</span><span 
class="na">propagate</span><span class="o">(</span><span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>What’s really going on here? The fast producer is an observable emitting an item every 100ms and then incrementing the <code>pendingTaskCount</code>. Subsequently, the emitted item is <code>flatMap</code>ed into another consumer observable emitting an item every 300ms and then decrementing the <code>pendingTaskCount</code>. That is, yet another simple producer-consumer pipeline. Finally, we ask for the first 5 items emitted out of the pipeline. Can you guess the program output? Or let me rephrase the question: Do you expect <code>pendingTaskCount</code> to be non-zero? Unfortunately, yes. It is 3 in this case. Let’s shed some more light on it:</p> <p><img src="prod-cons-pipeline.jpg" alt="Producer-Consumer Pipeline"></p> <p>As my spectacular drawing skills depict above, during the completion of the final 5th item, the producer generates 3 more items, which are later processed by the slow consumer. So you have 3 extra threads lingering in the background hogging both memory and processing resources. (Why 3? Because <code>consumePeriod / producePeriod = 3</code>.) While 3 seems like an innocent, negligible magnitude, this speed misalignment can get a lot worse once you deploy the application to production. (Yes, it did in our case at work.) What exactly do I mean by worse? <em>If we set <code>consumePeriod</code> to 10s and <code>producePeriod</code> to 10ms, there will be 1,000 threads running in the background at any particular point in time!</em></p> <h1 id="rx-has-a-word-to-say">Rx Has a Word To Say!</h1> <p>In a nutshell, we need to come up with a way to regulate the production pace in line with the consumption. We can do this either with an on-demand producer (<em>reactive pull</em>) or by blocking the producer itself (<em>callstack blocking</em>).
(Both in its <a href="https://github.com/ReactiveX/RxJava/wiki/Backpressure">official wiki</a> and <a href="http://stackoverflow.com/documentation/rx-java/2341/backpressure">Stack Overflow Documentation</a>, RxJava has quite some juice on the subject.)</p> <h2 id="discarding-the-over-production">Discarding the Over-Production</h2> <p>Three common methods provided out of the box by RxJava for dealing with back-pressure are <code>onBackpressureBuffer</code>, <code>onBackpressureDrop</code>, and <code>onBackpressureLatest</code>. While they definitely do the trick, rather than regulating the production speed, they just discard items emitted by the producer under certain back-pressure circumstances. (I am keeping the experimental RxJava &gt;1.0 feature <code>onBackpressureBlock</code> out of this discussion due to its ambiguous future and its known potential to introduce deadlocks.)</p> <h2 id="reactive-pull">Reactive Pull</h2> <p>RxJava has one more bullet up its sleeve, though: <a href="http://stackoverflow.com/documentation/rxjava/2341/backpressure">SyncOnSubscribe</a>. This almost orphaned, totally undocumented prodigy provides the necessary harness to create <em>stateful</em> and <em>on-demand</em> producers:</p> <pre><code class="language-java"><span class="n">SyncOnSubscribe</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">InputStream</span><span class="o">&gt;</span> <span class="n">binaryReader</span> <span class="o">=</span> <span class="n">SyncOnSubscribe</span><span class="o">.</span><span class="na">createStateful</span><span class="o">(</span> <span class="c1">// Create the initial state.
(Invoked per subscriber.)</span> <span class="o">()</span> <span class="o">-&gt;</span> <span class="k">new</span> <span class="nf">FileInputStream</span><span class="o">(</span><span class="s">"data.bin"</span><span class="o">),</span> <span class="c1">// Upon request, emit a new item and return the new state.</span> <span class="o">(</span><span class="n">inputStream</span><span class="o">,</span> <span class="n">output</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="k">try</span> <span class="o">{</span> <span class="kt">int</span> <span class="n">nextByte</span> <span class="o">=</span> <span class="n">inputStream</span><span class="o">.</span><span class="na">read</span><span class="o">();</span> <span class="k">if</span> <span class="o">(</span><span class="n">nextByte</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="o">)</span> <span class="n">output</span><span class="o">.</span><span class="na">onCompleted</span><span class="o">();</span> <span class="k">else</span> <span class="n">output</span><span class="o">.</span><span class="na">onNext</span><span class="o">(</span><span class="n">nextByte</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">IOException</span> <span class="n">ex</span><span class="o">)</span> <span class="o">{</span> <span class="n">output</span><span class="o">.</span><span class="na">onError</span><span class="o">(</span><span class="n">ex</span><span class="o">);</span> <span class="o">}</span> <span class="k">return</span> <span class="n">inputStream</span><span class="o">;</span> <span class="o">},</span> <span class="c1">// Perform final clean-up using the state.
(Invoked upon unsubscription.)</span> <span class="n">inputStream</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="k">try</span> <span class="o">{</span> <span class="n">inputStream</span><span class="o">.</span><span class="na">close</span><span class="o">();</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">IOException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="n">RxJavaHooks</span><span class="o">.</span><span class="na">onError</span><span class="o">(</span><span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span> <span class="o">);</span> <span class="n">Observable</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">observableBinaryReader</span> <span class="o">=</span> <span class="n">Observable</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">binaryReader</span><span class="o">);</span></code></pre> <p>Awesome! We are done, right? Unfortunately not. In RxJava, unless you specify otherwise, <a href="http://reactivex.io/RxJava/javadoc/rx/Subscriber.html#request(long)">every consumer tries to pull <code>Long.MAX_VALUE</code> items from the observable it is subscribed to</a>. 
You can change this behaviour by overriding this value:</p> <pre><code class="language-java"><span class="n">observableBinaryReader</span><span class="o">.</span><span class="na">subscribe</span><span class="o">(</span><span class="k">new</span> <span class="n">Subscriber</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onStart</span><span class="o">()</span> <span class="o">{</span> <span class="n">request</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span> <span class="c1">// Request 1 item on start up.</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onNext</span><span class="o">(</span><span class="n">Integer</span> <span class="n">v</span><span class="o">)</span> <span class="o">{</span> <span class="n">compute</span><span class="o">(</span><span class="n">v</span><span class="o">);</span> <span class="n">request</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span> <span class="c1">// Request a new item after consuming one.</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onError</span><span class="o">(</span><span class="n">Throwable</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="n">error</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">onCompleted</span><span class="o">()</span> <span class="o">{</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span
class="o">(</span><span class="s">"Done!"</span><span class="o">);</span> <span class="o">}</span> <span class="o">});</span></code></pre> <p>In other words, the subscriber needs to be aware of the producer-consumer pace mismatch and align them explicitly by limiting the number of requested items. To the best of my knowledge, it is not possible to force the subscriber to specify the number of requested items. You just need to hope that the next programmer consuming your <code>Observable&lt;T&gt;</code> will be able to figure out the back-pressure problem and override the <code>request(Long.MAX_VALUE)</code> behaviour. (But you know that they won’t, right?)</p> <p>As a matter of fact, <em>reactive pull</em> does not provide a solution for our over-productive observable example, which just blindly emits items, ignoring the consumer’s pace. We need a way to block the production according to the consumption rate. And Rx literature has already got a term for this approach: <em>Callstack Blocking</em>.</p> <h2 id="callstack-blocking">Callstack Blocking</h2> <p>Shamelessly copying from the <a href="https://github.com/ReactiveX/RxJava/wiki/Backpressure#callstack-blocking-as-a-flow-control-alternative-to-backpressure">RxJava wiki</a>:</p> <blockquote> <p>Another way of handling an over-productive <code>Observable</code> is to block the callstack (parking the thread that governs the over-productive <code>Observable</code>). This has the disadvantage of going against the <em>reactive</em> and non-blocking model of Rx. However this can be a viable option if the problematic <code>Observable</code> is on a thread that can be blocked safely. Currently RxJava does not expose any operators to facilitate this.</p> </blockquote> <p>But the good news is that you can implement this yourself.
Let me walk you through how to do it.</p> <h1 id="stack-your-own-back-pressure">Stack Your Own Back-Pressure</h1> <p>Let me introduce you to the poor man’s back-pressure queue.</p> <pre><code class="language-java"><span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span> <span class="kt">long</span> <span class="n">producePeriod</span> <span class="o">=</span> <span class="mi">100</span><span class="o">;</span> <span class="kt">long</span> <span class="n">consumePeriod</span> <span class="o">=</span> <span class="mi">300</span><span class="o">;</span> <span class="n">AtomicInteger</span> <span class="n">pendingTaskCount</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">AtomicInteger</span><span class="o">();</span> <span class="c1">// The token queue for producer-consumer pipeline.</span> <span class="n">BlockingQueue</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">tokens</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayBlockingQueue</span><span class="o">&lt;&gt;(</span> <span class="mi">1</span><span class="o">,</span> <span class="c1">// Number of tokens allowed.</span> <span class="kc">false</span><span class="o">,</span> <span class="c1">// fair?
(preserve the FIFO order?)</span> <span class="n">Collections</span><span class="o">.</span><span class="na">singleton</span><span class="o">(</span><span class="mi">1</span><span class="o">));</span> <span class="c1">// Initial tokens.</span> <span class="n">createStream</span><span class="o">(</span><span class="n">producePeriod</span><span class="o">,</span> <span class="kc">true</span><span class="o">,</span> <span class="o">()</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">incrementAndGet</span><span class="o">();</span> <span class="c1">// Try to acquire a token from the queue.</span> <span class="k">try</span> <span class="o">{</span> <span class="k">return</span> <span class="n">tokens</span><span class="o">.</span><span class="na">take</span><span class="o">();</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">InterruptedException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="k">throw</span> <span class="n">Throwables</span><span class="o">.</span><span class="na">propagate</span><span class="o">(</span><span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="o">})</span> <span class="o">.</span><span class="na">flatMap</span><span class="o">(</span><span class="n">token</span> <span class="o">-&gt;</span> <span class="n">createStream</span><span class="o">(</span><span class="n">consumePeriod</span><span class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span class="o">()</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">decrementAndGet</span><span class="o">();</span> <span class="c1">// Push the token back into the queue.</span> <span class="k">try</span> <span class="o">{</span> <span class="n">tokens</span><span 
class="o">.</span><span class="na">put</span><span class="o">(</span><span class="n">token</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">InterruptedException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="k">throw</span> <span class="n">Throwables</span><span class="o">.</span><span class="na">propagate</span><span class="o">(</span><span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="k">return</span> <span class="kc">null</span><span class="o">;</span> <span class="o">}))</span> <span class="o">.</span><span class="na">take</span><span class="o">(</span><span class="mi">5</span><span class="o">)</span> <span class="o">.</span><span class="na">toBlocking</span><span class="o">()</span> <span class="o">.</span><span class="na">last</span><span class="o">();</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"pending task count: %d\n"</span><span class="o">,</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">get</span><span class="o">());</span> <span class="o">}</span></code></pre> <p>Here we use a blocking queue to implement a token store that producers acquire from and consumers release to. This gives us a way to communicate the back-pressure from consumers to the producer. Initially there is just a single token. The producer acquires this token and emits an item. Note that the producer’s next call will block, since there are no tokens left in the queue. Next, the consumer emits an item and releases the token back into the queue. Now the blocked thread can proceed to emit a new item, and so on.
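</p> <p>The token queue above is, in effect, a counting semaphore. The same blocking idea can be sketched as a standalone program with <code>java.util.concurrent.Semaphore</code> (the class and parameter names here are mine, not from the original pipeline):</p>

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public final class SemaphoreBackPressure {

    /** Runs the pipeline and returns the number of tasks still pending at the end. */
    static int run(int maxInFlight, long consumePeriodMillis, int itemCount)
            throws InterruptedException {
        Semaphore tokens = new Semaphore(maxInFlight);
        AtomicInteger pendingTaskCount = new AtomicInteger();
        for (int itemIndex = 0; itemIndex < itemCount; itemIndex++) {
            tokens.acquire();                   // producer blocks when no tokens are left
            pendingTaskCount.incrementAndGet();
            new Thread(() -> {                  // slow consumer
                try {
                    Thread.sleep(consumePeriodMillis);
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
                pendingTaskCount.decrementAndGet();
                tokens.release();               // hand the token back
            }).start();
        }
        tokens.acquire(maxInFlight);            // wait until all consumers are done
        return pendingTaskCount.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pending task count: " + run(1, 300, 5));
    }
}
```

<p>With a single token the producer can never run more than one consumer ahead, so the run should end with zero pending tasks.</p> <p>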
By limiting the number of tokens initially available within the queue, we put an upper limit on the number of concurrent consumptions. This version of our producer-consumer pipeline reports a <code>pendingTaskCount</code> of 1, regardless of the producer/consumer speed mismatch.</p> <h1 id="back-pressure-for-the-masses">Back-Pressure for the Masses</h1> <p>Can we avoid the global reference to the token storage and instead make the back-pressure explicit in the observable’s type signature? Consider the following two interfaces:</p> <pre><code class="language-java"><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">BackPressuredFactory</span> <span class="o">{</span> <span class="nd">@Nonnull</span> <span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">BackPressured</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="nf">acquire</span><span class="o">(</span><span class="nd">@Nullable</span> <span class="n">T</span> <span class="n">instance</span><span class="o">);</span> <span class="o">}</span> <span class="kd">public</span> <span class="kd">interface</span> <span class="nc">BackPressured</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="o">{</span> <span class="nd">@Nullable</span> <span class="n">T</span> <span class="nf">getValue</span><span class="o">();</span> <span class="kt">void</span> <span class="nf">release</span><span class="o">();</span> <span class="o">}</span></code></pre> <p><code>BackPressuredFactory</code> creates instances of <code>BackPressured&lt;T&gt;</code>, each encapsulating a value associated with a token that must eventually be released.
Let’s try to put them into use:</p> <pre><code class="language-java"><span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span> <span class="kt">long</span> <span class="n">producePeriod</span> <span class="o">=</span> <span class="mi">100</span><span class="o">;</span> <span class="kt">long</span> <span class="n">consumePeriod</span> <span class="o">=</span> <span class="mi">300</span><span class="o">;</span> <span class="n">AtomicInteger</span> <span class="n">pendingTaskCount</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">AtomicInteger</span><span class="o">();</span> <span class="n">BackPressuredFactory</span> <span class="n">backPressuredFactory</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">BackPressuredFactoryImpl</span><span class="o">(</span> <span class="mi">1</span><span class="o">,</span> <span class="c1">// Number of concurrent tokens allowed.</span> <span class="mi">5000</span><span class="o">);</span> <span class="c1">// Max. 
acquire/release timeout in milliseconds.</span> <span class="n">createStream</span><span class="o">(</span><span class="n">producePeriod</span><span class="o">,</span> <span class="kc">true</span><span class="o">,</span> <span class="o">()</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">incrementAndGet</span><span class="o">();</span> <span class="c1">// Wrap the next item with a BackPressured&lt;T&gt; instance.</span> <span class="n">BackPressured</span><span class="o">&lt;</span><span class="n">Void</span><span class="o">&gt;</span> <span class="n">next</span> <span class="o">=</span> <span class="n">backPressuredFactory</span><span class="o">.</span><span class="na">acquire</span><span class="o">(</span><span class="kc">null</span><span class="o">);</span> <span class="k">return</span> <span class="n">next</span><span class="o">;</span> <span class="o">})</span> <span class="o">.</span><span class="na">flatMap</span><span class="o">(</span><span class="n">backPressuredToken</span> <span class="o">-&gt;</span> <span class="n">createStream</span><span class="o">(</span><span class="n">consumePeriod</span><span class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span class="o">()</span> <span class="o">-&gt;</span> <span class="o">{</span> <span class="k">try</span> <span class="o">{</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">decrementAndGet</span><span class="o">();</span> <span class="c1">// Getting the value out of the back-pressured token.</span> <span class="k">return</span> <span class="n">backPressuredToken</span><span class="o">.</span><span class="na">getValue</span><span class="o">();</span> <span class="o">}</span> <span class="k">finally</span> <span class="o">{</span> <span class="c1">// Release the token.</span> <span class="n">backPressuredToken</span><span class="o">.</span><span 
class="na">release</span><span class="o">();</span> <span class="o">}</span> <span class="o">}))</span> <span class="o">.</span><span class="na">take</span><span class="o">(</span><span class="mi">5</span><span class="o">)</span> <span class="o">.</span><span class="na">toBlocking</span><span class="o">()</span> <span class="o">.</span><span class="na">last</span><span class="o">();</span> <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"pending task count: %d\n"</span><span class="o">,</span> <span class="n">pendingTaskCount</span><span class="o">.</span><span class="na">get</span><span class="o">());</span> <span class="o">}</span></code></pre> <p>In a nutshell, we encapsulate every item of type <code>T</code> that producer emits into a <code>BackPressured&lt;T&gt;</code> instance. <code>BackPressuredFactory</code> contains the token storage. Given these requirements a sample implementation of these interfaces can be given as follows:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">org.slf4j.Logger</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">org.slf4j.LoggerFactory</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.List</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.ArrayBlockingQueue</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.BlockingQueue</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.TimeUnit</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.stream.Collectors</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.stream.IntStream</span><span class="o">;</span> <span class="kn">import</span> <span 
class="nn">static</span> <span class="n">com</span><span class="o">.</span><span class="na">google</span><span class="o">.</span><span class="na">common</span><span class="o">.</span><span class="na">base</span><span class="o">.</span><span class="na">Preconditions</span><span class="o">.</span><span class="na">checkArgument</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">class</span> <span class="nc">BackPressuredFactoryImpl</span> <span class="kd">implements</span> <span class="n">BackPressuredFactory</span> <span class="o">{</span> <span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">Logger</span> <span class="n">LOGGER</span> <span class="o">=</span> <span class="n">LoggerFactory</span><span class="o">.</span><span class="na">getLogger</span><span class="o">(</span><span class="n">BackPressuredFactoryImpl</span><span class="o">.</span><span class="na">class</span><span class="o">);</span> <span class="kd">private</span> <span class="kd">final</span> <span class="n">BlockingQueue</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">tokens</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">final</span> <span class="kt">long</span> <span class="n">timeoutMillis</span><span class="o">;</span> <span class="kd">public</span> <span class="nf">BackPressuredFactoryImpl</span><span class="o">(</span><span class="kt">int</span> <span class="n">bufferSize</span><span class="o">,</span> <span class="kt">long</span> <span class="n">timeoutMillis</span><span class="o">)</span> <span class="o">{</span> <span class="n">checkArgument</span><span class="o">(</span><span class="n">bufferSize</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">,</span> <span class="s">"bufferSize &gt; 0, found: %d"</span><span class="o">,</span> <span class="n">bufferSize</span><span class="o">);</span> 
<span class="n">checkArgument</span><span class="o">(</span><span class="n">timeoutMillis</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">,</span> <span class="s">"timeoutMillis &gt; 0, found: %d"</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">);</span> <span class="n">List</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">initialTokens</span> <span class="o">=</span> <span class="n">IntStream</span><span class="o">.</span><span class="na">range</span><span class="o">(</span><span class="mi">0</span><span class="o">,</span> <span class="n">bufferSize</span><span class="o">).</span><span class="na">boxed</span><span class="o">().</span><span class="na">collect</span><span class="o">(</span><span class="n">Collectors</span><span class="o">.</span><span class="na">toList</span><span class="o">());</span> <span class="k">this</span><span class="o">.</span><span class="na">tokens</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayBlockingQueue</span><span class="o">&lt;&gt;(</span><span class="n">bufferSize</span><span class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span class="n">initialTokens</span><span class="o">);</span> <span class="k">this</span><span class="o">.</span><span class="na">timeoutMillis</span> <span class="o">=</span> <span class="n">timeoutMillis</span><span class="o">;</span> <span class="n">LOGGER</span><span class="o">.</span><span class="na">trace</span><span class="o">(</span><span class="s">"initialized (bufferSize={}, timeoutMillis={})"</span><span class="o">,</span> <span class="n">bufferSize</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">);</span> <span class="o">}</span> <span class="nd">@Nonnull</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="o">&lt;</span><span class="n">T</span><span 
class="o">&gt;</span> <span class="n">BackPressured</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="nf">acquire</span><span class="o">(</span><span class="nd">@Nullable</span> <span class="n">T</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="n">LOGGER</span><span class="o">.</span><span class="na">trace</span><span class="o">(</span><span class="s">"acquiring (peekedToken={})"</span><span class="o">,</span> <span class="n">tokens</span><span class="o">.</span><span class="na">peek</span><span class="o">());</span> <span class="k">try</span> <span class="o">{</span> <span class="n">Integer</span> <span class="n">token</span> <span class="o">=</span> <span class="n">tokens</span><span class="o">.</span><span class="na">poll</span><span class="o">(</span><span class="n">timeoutMillis</span><span class="o">,</span> <span class="n">TimeUnit</span><span class="o">.</span><span class="na">MILLISECONDS</span><span class="o">);</span> <span class="k">if</span> <span class="o">(</span><span class="n">token</span> <span class="o">==</span> <span class="kc">null</span><span class="o">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">"token acquisition timeout"</span><span class="o">);</span> <span class="k">return</span> <span class="k">new</span> <span class="n">BackPressuredImpl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;(</span><span class="n">tokens</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">,</span> <span class="n">token</span><span class="o">,</span> <span class="n">value</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">InterruptedException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span 
class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">"token acquisition failure"</span><span class="o">,</span> <span class="n">error</span><span class="o">);</span> <span class="o">}</span> <span class="o">}</span> <span class="o">}</span></code></pre> <p>And here is <code>BackPressured&lt;T&gt;</code>:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">org.slf4j.Logger</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">org.slf4j.LoggerFactory</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.BlockingQueue</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">java.util.concurrent.TimeUnit</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">static</span> <span class="n">com</span><span class="o">.</span><span class="na">google</span><span class="o">.</span><span class="na">common</span><span class="o">.</span><span class="na">base</span><span class="o">.</span><span class="na">Preconditions</span><span class="o">.</span><span class="na">checkArgument</span><span class="o">;</span> <span class="kn">import</span> <span class="nn">static</span> <span class="n">com</span><span class="o">.</span><span class="na">google</span><span class="o">.</span><span class="na">common</span><span class="o">.</span><span class="na">base</span><span class="o">.</span><span class="na">Preconditions</span><span class="o">.</span><span class="na">checkNotNull</span><span class="o">;</span> <span class="kd">public</span> <span class="kd">class</span> <span class="nc">BackPressuredImpl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="kd">implements</span> <span class="n">BackPressured</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="o">{</span> <span 
class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">Logger</span> <span class="n">LOGGER</span> <span class="o">=</span> <span class="n">LoggerFactory</span><span class="o">.</span><span class="na">getLogger</span><span class="o">(</span><span class="n">BackPressuredImpl</span><span class="o">.</span><span class="na">class</span><span class="o">);</span> <span class="kd">private</span> <span class="kd">final</span> <span class="n">BlockingQueue</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">tokens</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">final</span> <span class="kt">long</span> <span class="n">timeoutMillis</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">final</span> <span class="kt">int</span> <span class="n">token</span><span class="o">;</span> <span class="kd">private</span> <span class="kd">final</span> <span class="n">T</span> <span class="n">value</span><span class="o">;</span> <span class="kd">public</span> <span class="nf">BackPressuredImpl</span><span class="o">(</span><span class="nd">@Nonnull</span> <span class="n">BlockingQueue</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">tokens</span><span class="o">,</span> <span class="kt">long</span> <span class="n">timeoutMillis</span><span class="o">,</span> <span class="kt">int</span> <span class="n">token</span><span class="o">,</span> <span class="nd">@Nullable</span> <span class="n">T</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="n">checkArgument</span><span class="o">(</span><span class="n">timeoutMillis</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">,</span> <span class="s">"timeoutMillis &gt; 0, found: %d"</span><span class="o">,</span> <span class="n">timeoutMillis</span><span 
class="o">);</span> <span class="k">this</span><span class="o">.</span><span class="na">tokens</span> <span class="o">=</span> <span class="n">checkNotNull</span><span class="o">(</span><span class="n">tokens</span><span class="o">,</span> <span class="s">"null tokens"</span><span class="o">);</span> <span class="k">this</span><span class="o">.</span><span class="na">timeoutMillis</span> <span class="o">=</span> <span class="n">timeoutMillis</span><span class="o">;</span> <span class="k">this</span><span class="o">.</span><span class="na">token</span> <span class="o">=</span> <span class="n">token</span><span class="o">;</span> <span class="k">this</span><span class="o">.</span><span class="na">value</span> <span class="o">=</span> <span class="n">value</span><span class="o">;</span> <span class="n">LOGGER</span><span class="o">.</span><span class="na">trace</span><span class="o">(</span><span class="s">"initialized (token={})"</span><span class="o">,</span> <span class="n">token</span><span class="o">,</span> <span class="n">value</span><span class="o">);</span> <span class="o">}</span> <span class="nd">@Nullable</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="n">T</span> <span class="nf">getValue</span><span class="o">()</span> <span class="o">{</span> <span class="k">return</span> <span class="n">value</span><span class="o">;</span> <span class="o">}</span> <span class="nd">@Override</span> <span class="kd">public</span> <span class="kt">void</span> <span class="nf">release</span><span class="o">()</span> <span class="o">{</span> <span class="n">LOGGER</span><span class="o">.</span><span class="na">trace</span><span class="o">(</span><span class="s">"releasing (token={})"</span><span class="o">,</span> <span class="n">token</span><span class="o">);</span> <span class="k">try</span> <span class="o">{</span> <span class="k">if</span> <span class="o">(!</span><span class="n">tokens</span><span class="o">.</span><span 
class="na">offer</span><span class="o">(</span><span class="n">token</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">,</span> <span class="n">TimeUnit</span><span class="o">.</span><span class="na">MILLISECONDS</span><span class="o">))</span> <span class="o">{</span> <span class="n">String</span> <span class="n">message</span> <span class="o">=</span> <span class="n">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"token release timeout (timeoutMillis=%d, token=%d)"</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">,</span> <span class="n">token</span><span class="o">);</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="n">message</span><span class="o">);</span> <span class="o">}</span> <span class="n">LOGGER</span><span class="o">.</span><span class="na">trace</span><span class="o">(</span><span class="s">"released (token={})"</span><span class="o">,</span> <span class="n">token</span><span class="o">);</span> <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">InterruptedException</span> <span class="n">error</span><span class="o">)</span> <span class="o">{</span> <span class="n">String</span> <span class="n">message</span> <span class="o">=</span> <span class="n">String</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"token release failure (timeoutMillis=%d, token=%d)"</span><span class="o">,</span> <span class="n">timeoutMillis</span><span class="o">,</span> <span class="n">token</span><span class="o">);</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="n">message</span><span class="o">,</span> <span class="n">error</span><span class="o">);</span> <span 
class="o">}</span> <span class="o">}</span> <span class="o">}</span></code></pre> <h1 id="conclusion">Conclusion</h1> <p>Back-pressure is a significant aspect in every producer-consumer pipeline. It can be easily overlooked and holds a potential to break the system depending on the speed mismatch of the involved actors. In this post, I examined the problem in a sample RxJava application and provided a solution leveraging <em>callback blocking</em> approach that can be employed in almost any domain where the back-pressure needs to communicated. I hope you find it useful as well.</p> tag:vlkan.com,2016-03-02://blog/post/2016/03/02/optimize-dc/ Optimizing a Data Center Using Integer Programming 2016-03-02T19:32:00Z 2016-03-02T19:32:00Z <p><a href="https://hashcode.withgoogle.com/2015/tasks/hashcode2015_qualification_task.pdf">Optimize a Data Center</a> is a <strong>challenging</strong> programming problem presented in the Qualification Round of the <a href="https://hashcode.withgoogle.com/past_editions.html">Hash Code 2015</a>. Among 230 teams, <em>What’s in a name?</em> from <a href="http://www.ens.fr/">École normale supérieure</a> ranked first in the qualification, but third in the final round. By challenging, I mean it is not possible to come up with a deterministic polynomial-time optimal answer. I am not in a position to either provide a rigorous proof of its complexity or its reduction to a known NP-hard problem. But in this blog post I will investigate the following question: Can we provide an optimal solution using <a href="https://en.wikipedia.org/wiki/Integer_programming">integer programming</a>? In practice, that would allow us to come up with an optimal solution to small-sized problems. Without further ado, let’s start with the problem definition.</p> <h1 id="the-problem">The Problem</h1> <p>Here, I will present a brief summary of <a href="https://hashcode.withgoogle.com/2015/tasks/hashcode2015_qualification_task.pdf">the actual problem</a>. 
A data center is modeled as <strong>rows</strong> of <strong>slots</strong> in which servers can be placed. Some of the slots are known to be <strong>unavailable</strong>.</p> <p><img src="rows.jpg" alt="Data center rows"></p> <p>Each <strong>server</strong> is characterized by its <strong>size</strong> and <strong>capacity</strong>. Size is the number of consecutive slots occupied by the machine. Capacity is the total amount of CPU resources of the machine (an integer value).</p> <p><img src="servers.jpg" alt="Data center servers"></p> <p>Servers in a data center are also logically divided into <strong>pools</strong>. Each server belongs to exactly one pool. The capacity of a pool is the sum of the capacities of the available servers in that pool.</p> <p>The <strong>guaranteed capacity</strong> of a pool is the minimum capacity it will have when at most one data center row goes down. Given a schema of a data center and a list of available servers, the goal is to assign servers to slots within the rows and to logical pools so that the lowest guaranteed capacity of all pools is maximized.</p> <p>Consider the following data center schema and a list of available servers. For simplicity, it is assumed that server capacities are equal to server sizes.</p> <p><img src="example-prob.jpg" alt="Example problem"></p> <p>The following layout is a solution to the problem given above.
Here, different pools are denoted by distinct colors.</p> <p><img src="example-soln.jpg" alt="Example solution"></p> <h1 id="preliminaries">Preliminaries</h1> <p>Before modeling the IP (Integer Program), I will start by stating the problem input.</p> <ul> <li> <script type="math/tex">R</script> denotes the number of rows in the data center</li> <li> <script type="math/tex">S</script> denotes the number of slots in each row of the data center</li> <li> <script type="math/tex">U \leq RS</script> denotes the number of unavailable slots</li> <li> <script type="math/tex">P</script> denotes the number of pools to be created</li> <li> <script type="math/tex">M \leq RS</script> denotes the number of servers to be allocated</li> <li> <script type="math/tex">z_k</script> and <script type="math/tex">c_k</script> denote the <script type="math/tex">k</script>th server’s size and capacity, respectively</li> <li> <script type="math/tex">r_i</script> and <script type="math/tex">s_i</script> denote the <script type="math/tex">i</script>th unavailable slot’s row and slot indices, respectively</li> </ul> <p>While modeling the formulation, I will need to provide constraints to avoid placing servers in unavailable slots. Rather than doing that, <a href="http://kaygun.tumblr.com/">Atabey Kaygun</a> suggested that I represent the data in <strong>blocks</strong>. That is, instead of <script type="math/tex">U</script>, <script type="math/tex">r_i</script>, and <script type="math/tex">s_i</script>, I will transform this data into a single lookup table called <script type="math/tex">z(i, j)</script> that denotes the size of the available slots at the <script type="math/tex">i</script>th row and <script type="math/tex">j</script>th block.
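</p> <p>Computing these blocks is a single left-to-right pass over each row. Below is a minimal sketch (the class name and the sample slot indices are mine, chosen only for illustration):</p>

```java
import java.util.ArrayList;
import java.util.List;

public class BlockBuilder {

    /**
     * Splits a row of slotCount slots into blocks of consecutive available
     * slots, given the sorted indices of the unavailable slots in that row.
     * Returns the sizes z(i, 0), z(i, 1), ... of the row's non-empty blocks.
     */
    static List<Integer> blockSizes(int slotCount, int... unavailableSlots) {
        List<Integer> sizes = new ArrayList<>();
        int blockStart = 0;
        for (int unavailableSlot : unavailableSlots) {
            if (unavailableSlot > blockStart) {
                // The slots between blockStart and the unavailable slot form a block.
                sizes.add(unavailableSlot - blockStart);
            }
            blockStart = unavailableSlot + 1;
        }
        if (slotCount > blockStart) {
            // The trailing slots after the last unavailable slot form the final block.
            sizes.add(slotCount - blockStart);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // A 10-slot row with unavailable slots at indices 5 and 9.
        System.out.println(blockSizes(10, 5, 9));
    }
}
```

<p>Here, a 10-slot row with unavailable slots at indices 5 and 9 splits into blocks of sizes 5 and 3, while a row without unavailable slots yields a single block spanning the entire row.</p> <p>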
For instance, consider the following layout:</p> <p><img src="blocks.jpg" alt="Example blocks"></p> <p>Here blocks will look as follows:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align} [z(0, 0)] &= [10] \\ [z(1, 0), z(1, 1)] &= [5,3] \\ [z(2, 0), z(2, 1), z(2, 2)] &= [3, 4, 1] \\ [z(3, 0)] &= [6] \end{align} %]]></script> <h1 id="the-integer-programming-model">The Integer Programming Model</h1> <p>For simplicity, I will adopt the following index notation:</p> <ul> <li> <script type="math/tex">i</script> denotes row indices (<script type="math/tex">0 \leq i \leq R</script>)</li> <li> <script type="math/tex">j</script> denotes block (not slot!) indices (varies per row)</li> <li> <script type="math/tex">k</script> denotes server indices (<script type="math/tex">0 \leq k \leq M</script>)</li> <li> <script type="math/tex">\ell</script> denotes pool indices (<script type="math/tex">0 \leq \ell \leq P</script>)</li> </ul> <p>Given that <script type="math/tex">z(i, j)</script> denotes the available blocks, the IP model can be defined as follows:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align} \text{maximize} & \quad \min_\ell g(\ell) \quad \text{(minimum of pool guaranteed capacities)} \\ \text{subject to} & \quad \sum_\ell a(i, j, k, \ell) \leq 1, \, \forall k \quad \text{(a server can be assigned to at most 1 pool and block)} \\ & \quad \sum_{k,\ell} z_k \, a(i, j, k, \ell) \leq z(i, j), \, \forall i, j \quad \text{(total size of servers within a block cannot exceed block's size)} \\ \text{where} & \quad a(i, j, k, \ell) = \begin{cases} 1 & \quad \text{if } k \text{th server is assigned to } (i, j) \text{th block and } \ell \text{th pool} \\ 0 & \quad \text{otherwise} \\ \end{cases} \\ & \quad g(\ell) = \min_i g(\ell, i) \quad \text{(guaranteed capacity of } \ell \text{th pool)} \\ & \quad g(\ell, i) = \sum_{i', j, k} c_k \, a(i', j, k, \ell) - \sum_{j, k} c_k \, a(i, j, k, \ell) \quad \text{(guaranteed capacity of } \ell \text{th 
pool for } i \text{th row)} \\ \end{align} %]]></script> <h1 id="avoiding-minimax-constraint">Avoiding Minimax Constraint</h1> <p>The presented IP model contains a minimax objective, which, to the best of my knowledge, is not tractable by popular linear programming optimizers, such as <a href="http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/">CPLEX</a> or <a href="http://lpsolve.sourceforge.net/">lpsolve</a>. But I have a trick in my pocket to tackle that. Let’s assume that we know the optimal objective, say <script type="math/tex">g^*</script>. Then we can model the entire IP as follows:</p> <script type="math/tex; mode=display">% <![CDATA[ \begin{align} \text{maximize} & \quad 1 \quad \text{(a dummy objective)} \\ \text{subject to} & \quad \sum_\ell a(i, j, k, \ell) \leq 1, \, \forall k \\ & \quad \sum_{k,\ell} z_k \, a(i, j, k, \ell) \leq z(i, j), \, \forall i, j \\ & \quad g(\ell, i) \geq g^*, \, \forall \ell, i \quad \text{(guaranteed capacity must be greater than or equal to } g^* \text{)} \\ \end{align} %]]></script> <p>What this model states is: I am not interested in the optimization objective; return the first feasible solution found. That is, the optimizer will return the first assignment of the <script type="math/tex">a(i, j, k, \ell)</script> variables the moment it finds a feasible solution satisfying the <script type="math/tex">g(\ell, i) \geq g^*, \, \forall \ell, i</script> constraints.</p> <p>Now things are getting interesting. If we can find bounds on <script type="math/tex">g^*</script>, then we can use these bounds to bisect the optimal <script type="math/tex">g^*</script>! For the lower bound, we know that <script type="math/tex">g_i = 0 \leq g^*</script>. The upper bound is a little bit tricky, but we can come up with a quite loose one: <script type="math/tex">g_f = \frac{1}{P} \sum_k c_k \gg g^*</script>. (I will not go into details of how to come up with a stricter upper bound.)
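</p> <p>This bound-driven search is ordinary binary search over integer objective values. A rough sketch follows (the class name is mine, and the feasibility predicate is a toy stand-in for solving the dummy-objective model once per candidate value):</p>

```java
import java.util.function.IntPredicate;

public class GuaranteedCapacityBisection {

    /**
     * Finds the largest g in [lo, hi] for which feasible.test(g) holds,
     * assuming feasibility is monotone: if g is feasible, so is every g' <= g.
     * Each feasible.test(g) call stands in for one solver run of the
     * dummy-objective model with the "guaranteed capacity >= g" constraints.
     */
    static int bisect(int lo, int hi, IntPredicate feasible) {
        while (lo < hi) {
            int mid = lo + (hi - lo + 1) / 2; // round up so the loop terminates
            if (feasible.test(mid)) {
                lo = mid;     // a layout with guaranteed capacity mid exists; step forward
            } else {
                hi = mid - 1; // infeasible; step back
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Toy stand-in: pretend any g <= 5 is feasible.
        System.out.println(bisect(0, 16, g -> g <= 5));
    }
}
```

<p>Each <code>feasible</code> call maps to one solver invocation, so the number of solver runs is logarithmic in the width of the initial interval.</p> <p>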
So by picking <script type="math/tex">g^* \in (g_i, g_f)</script> we can bisect <strong>the optimal guaranteed capacity</strong>.</p> <h1 id="the-solver">The Solver</h1> <p>For testing purposes, I wrote a simple <a href="https://gist.github.com/vy/9689cb122a84be22d454">Python script</a> that reads an input problem file and calls CPLEX iteratively. A sample output of the script is as follows:</p> <pre><code>$ ./dc.py ./cplex.sh prob/dc-min.in soln/dc-min 0 2016-03-03 08:35:25 DEBUG reading setup: prob/dc-min.in 2016-03-03 08:35:25 DEBUG R=2, S=5, U=1, P=2, M=5 2016-03-03 08:35:25 INFO solving for 0 &lt;= g=8 &lt; 16 2016-03-03 08:35:25 DEBUG writing problem: soln/dc-min-8.lp 2016-03-03 08:35:25 DEBUG running solver 2016-03-03 08:35:25 DEBUG writing cplex output: soln/dc-min-8.out 2016-03-03 08:35:25 DEBUG no solution 2016-03-03 08:35:25 DEBUG stepping back 2016-03-03 08:35:25 INFO solving for 0 &lt;= g=4 &lt; 8 2016-03-03 08:35:25 DEBUG writing problem: soln/dc-min-4.lp 2016-03-03 08:35:25 DEBUG running solver 2016-03-03 08:35:25 DEBUG writing cplex output: soln/dc-min-4.out 2016-03-03 08:35:25 DEBUG solution score: 5 2016-03-03 08:35:25 DEBUG writing solution: soln/dc-min-4.soln 2016-03-03 08:35:25 DEBUG stepping forward 2016-03-03 08:35:25 INFO solving for 5 &lt;= g=6 &lt; 8 2016-03-03 08:35:25 DEBUG writing problem: soln/dc-min-6.lp 2016-03-03 08:35:25 DEBUG running solver 2016-03-03 08:35:25 DEBUG writing cplex output: soln/dc-min-6.out 2016-03-03 08:35:25 DEBUG no solution 2016-03-03 08:35:25 DEBUG stepping back 2016-03-03 08:35:25 INFO solving for 5 &lt;= g=5 &lt; 6 2016-03-03 08:35:25 DEBUG writing problem: soln/dc-min-5.lp 2016-03-03 08:35:25 DEBUG running solver 2016-03-03 08:35:25 DEBUG writing cplex output: soln/dc-min-5.out 2016-03-03 08:35:25 DEBUG solution score: 5 2016-03-03 08:35:25 DEBUG writing solution: soln/dc-min-5.soln 2016-03-03 08:35:25 DEBUG stepping forward </code></pre> <p>It also outputs intermediate IP formulation files, CPLEX output, 
and the solution file. The format of the problem and solution files is detailed in <a href="https://hashcode.withgoogle.com/2015/tasks/hashcode2015_qualification_task.pdf">the official problem description</a>.</p> tag:vlkan.com,2015-11-27://blog/post/2015/11/27/maven-protobuf/ Compiling Protocol Buffers Sources in Maven 2015-11-27T08:49:00Z 2015-11-27T08:49:00Z <blockquote> <p><strong>TL;DR</strong> – This post explains how to compile Protocol Buffers schemas into Java sources in Maven. Over the course of time, plugins like <a href="https://github.com/os72/protoc-jar-maven-plugin">protoc-jar-maven-plugin</a> have appeared. Nevertheless, the steps below still have value for understanding the necessary plumbing and some best practices (e.g., shading) that are not covered by the plugins.</p> </blockquote> <p>Java build systems have always been second-class citizens for <a href="https://developers.google.com/protocol-buffers">Protocol Buffers</a>. As is the case for many other Java serialization frameworks (e.g., <a href="https://capnproto.org/">Cap’n Proto</a>, <a href="http://google.github.io/flatbuffers/">FlatBuffers</a>), Protocol Buffers does not provide a native Java compiler that you can add to your Maven dependencies and invoke via a plugin to compile <code>.proto</code> files into <code>.java</code> sources. Hence, programmers needed to have the <code>protoc</code> (Protocol Buffers compiler) binary on their development machines and call this platform-dependent binary during the Maven build. This totally violates the environment-independent build of a project.
Fortunately, Google releases platform-specific <code>protoc</code> binaries in the form of Maven artifacts.</p> <p><img src="protoc-artifacts.png" alt="Maven Artifacts for Proto Buffers Compiler Binary"></p> <p>We can add these artifacts as a compile-time dependency to our Maven project and invoke the platform-dependent binary to compile <code>.proto</code> sources.</p> <h1 id="preliminaries">Preliminaries</h1> <p>Let’s start with defining certain properties for library versions and input/output directories for the Protocol Buffers compiler.</p> <pre><code class="language-xml"><span class="nt">&lt;properties&gt;</span> <span class="c">&lt;!-- protobuf paths --&gt;</span> <span class="nt">&lt;protobuf.input.directory&gt;</span>${project.basedir}/src/main/proto<span class="nt">&lt;/protobuf.input.directory&gt;</span> <span class="nt">&lt;protobuf.output.directory&gt;</span>${project.build.directory}/generated-sources<span class="nt">&lt;/protobuf.output.directory&gt;</span> <span class="c">&lt;!-- library versions --&gt;</span> <span class="nt">&lt;build-helper-maven-plugin.version&gt;</span>1.9.1<span class="nt">&lt;/build-helper-maven-plugin.version&gt;</span> <span class="nt">&lt;maven-antrun-plugin.version&gt;</span>1.8<span class="nt">&lt;/maven-antrun-plugin.version&gt;</span> <span class="nt">&lt;maven-dependency-plugin.version&gt;</span>2.10<span class="nt">&lt;/maven-dependency-plugin.version&gt;</span> <span class="nt">&lt;maven-shade-plugin.version&gt;</span>2.4.2<span class="nt">&lt;/maven-shade-plugin.version&gt;</span> <span class="nt">&lt;os-maven-plugin.version&gt;</span>1.4.1.Final<span class="nt">&lt;/os-maven-plugin.version&gt;</span> <span class="nt">&lt;protobuf.version&gt;</span>3.0.0-beta-1<span class="nt">&lt;/protobuf.version&gt;</span> <span class="nt">&lt;/properties&gt;</span></code></pre> <h1 id="protocol-buffers-java-api">Protocol Buffers Java API</h1> <p><code>protoc</code> compiles <code>.proto</code> files into <code>.java</code> 
files such that the generated sources rely on certain common classes. These classes are provided by the <code>protobuf-java</code> artifact:</p> <pre><code class="language-xml"><span class="nt">&lt;dependency&gt;</span> <span class="nt">&lt;groupId&gt;</span>com.google.protobuf<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>protobuf-java<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${protobuf.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;/dependency&gt;</span></code></pre> <h1 id="detecting-operating-system">Detecting Operating System</h1> <p>The <code>protoc</code> Maven artifact is provided with various platform-specific <em>classifiers</em>: <code>linux-x86_32</code>, <code>linux-x86_64</code>, <code>osx-x86_32</code>, <code>osx-x86_64</code>, <code>windows-x86_32</code>, <code>windows-x86_64</code>. In order to pick the right artifact, we will employ the <code>os.detected.classifier</code> property exposed by <code>os-maven-plugin</code>:</p> <pre><code class="language-xml"><span class="nt">&lt;build&gt;</span> <span class="nt">&lt;extensions&gt;</span> <span class="c">&lt;!-- provides os.detected.classifier (i.e. linux-x86_64, osx-x86_64) property --&gt;</span> <span class="nt">&lt;extension&gt;</span> <span class="nt">&lt;groupId&gt;</span>kr.motd.maven<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>os-maven-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${os-maven-plugin.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;/extension&gt;</span> <span class="nt">&lt;/extensions&gt;</span> <span class="c">&lt;!-- ... 
--&gt;</span> <span class="nt">&lt;/build&gt;</span></code></pre> <h1 id="downloading-platform-specific-protocol-buffers-compiler">Downloading Platform-Specific Protocol Buffers Compiler</h1> <p>We will use <code>maven-dependency-plugin</code> to download the platform-specific <code>protoc</code> binary suitable for the current build platform and copy it into <code>project.build.directory</code>.</p> <pre><code class="language-xml"><span class="c">&lt;!-- copy protoc binary into build directory --&gt;</span> <span class="nt">&lt;plugin&gt;</span> <span class="nt">&lt;groupId&gt;</span>org.apache.maven.plugins<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>maven-dependency-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${maven-dependency-plugin.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;executions&gt;</span> <span class="nt">&lt;execution&gt;</span> <span class="nt">&lt;id&gt;</span>copy-protoc<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;phase&gt;</span>generate-sources<span class="nt">&lt;/phase&gt;</span> <span class="nt">&lt;goals&gt;</span> <span class="nt">&lt;goal&gt;</span>copy<span class="nt">&lt;/goal&gt;</span> <span class="nt">&lt;/goals&gt;</span> <span class="nt">&lt;configuration&gt;</span> <span class="nt">&lt;artifactItems&gt;</span> <span class="nt">&lt;artifactItem&gt;</span> <span class="nt">&lt;groupId&gt;</span>com.google.protobuf<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>protoc<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${protobuf.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;classifier&gt;</span>${os.detected.classifier}<span class="nt">&lt;/classifier&gt;</span> <span class="nt">&lt;type&gt;</span>exe<span class="nt">&lt;/type&gt;</span> <span class="nt">&lt;overWrite&gt;</span>true<span class="nt">&lt;/overWrite&gt;</span> 
<span class="nt">&lt;outputDirectory&gt;</span>${project.build.directory}<span class="nt">&lt;/outputDirectory&gt;</span> <span class="nt">&lt;/artifactItem&gt;</span> <span class="nt">&lt;/artifactItems&gt;</span> <span class="nt">&lt;/configuration&gt;</span> <span class="nt">&lt;/execution&gt;</span> <span class="nt">&lt;/executions&gt;</span> <span class="nt">&lt;/plugin&gt;</span></code></pre> <p>Note how we employed <code>os.detected.classifier</code> variable provided by <code>os-maven-plugin</code> to inject the platform-specific binary dependency.</p> <h1 id="generating-protocol-buffers-java-sources">Generating Protocol Buffers Java Sources</h1> <p>Now we have our <code>protoc</code> binary in <code>project.build.directory</code>. We can use <code>maven-antrun-plugin</code> plugin to execute <code>protoc</code> for compiling <code>.proto</code> files into <code>.java</code> sources.</p> <pre><code class="language-xml"><span class="c">&lt;!-- compile proto buffer files using copied protoc binary --&gt;</span> <span class="nt">&lt;plugin&gt;</span> <span class="nt">&lt;groupId&gt;</span>org.apache.maven.plugins<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>maven-antrun-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${maven-antrun-plugin.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;executions&gt;</span> <span class="nt">&lt;execution&gt;</span> <span class="nt">&lt;id&gt;</span>exec-protoc<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;phase&gt;</span>generate-sources<span class="nt">&lt;/phase&gt;</span> <span class="nt">&lt;configuration&gt;</span> <span class="nt">&lt;target&gt;</span> <span class="nt">&lt;property</span> <span class="na">name=</span><span class="s">"protoc.filename"</span> <span class="na">value=</span><span class="s">"protoc-${protobuf.version}-${os.detected.classifier}.exe"</span><span class="nt">/&gt;</span> <span 
class="nt">&lt;property</span> <span class="na">name=</span><span class="s">"protoc.filepath"</span> <span class="na">value=</span><span class="s">"${project.build.directory}/${protoc.filename}"</span><span class="nt">/&gt;</span> <span class="nt">&lt;chmod</span> <span class="na">file=</span><span class="s">"${protoc.filepath}"</span> <span class="na">perm=</span><span class="s">"ugo+rx"</span><span class="nt">/&gt;</span> <span class="nt">&lt;mkdir</span> <span class="na">dir=</span><span class="s">"${protobuf.output.directory}"</span> <span class="nt">/&gt;</span> <span class="nt">&lt;path</span> <span class="na">id=</span><span class="s">"protobuf.input.filepaths.path"</span><span class="nt">&gt;</span> <span class="nt">&lt;fileset</span> <span class="na">dir=</span><span class="s">"${protobuf.input.directory}"</span><span class="nt">&gt;</span> <span class="nt">&lt;include</span> <span class="na">name=</span><span class="s">"**/*.proto"</span><span class="nt">/&gt;</span> <span class="nt">&lt;/fileset&gt;</span> <span class="nt">&lt;/path&gt;</span> <span class="nt">&lt;pathconvert</span> <span class="na">pathsep=</span><span class="s">" "</span> <span class="na">property=</span><span class="s">"protobuf.input.filepaths"</span> <span class="na">refid=</span><span class="s">"protobuf.input.filepaths.path"</span><span class="nt">/&gt;</span> <span class="nt">&lt;exec</span> <span class="na">executable=</span><span class="s">"${protoc.filepath}"</span> <span class="na">failonerror=</span><span class="s">"true"</span><span class="nt">&gt;</span> <span class="nt">&lt;arg</span> <span class="na">value=</span><span class="s">"-I"</span><span class="nt">/&gt;</span> <span class="nt">&lt;arg</span> <span class="na">value=</span><span class="s">"${protobuf.input.directory}"</span><span class="nt">/&gt;</span> <span class="nt">&lt;arg</span> <span class="na">value=</span><span class="s">"--java_out"</span><span class="nt">/&gt;</span> <span class="nt">&lt;arg</span> 
<span class="na">value=</span><span class="s">"${protobuf.output.directory}"</span><span class="nt">/&gt;</span> <span class="nt">&lt;arg</span> <span class="na">line=</span><span class="s">"${protobuf.input.filepaths}"</span><span class="nt">/&gt;</span> <span class="nt">&lt;/exec&gt;</span> <span class="nt">&lt;/target&gt;</span> <span class="nt">&lt;/configuration&gt;</span> <span class="nt">&lt;goals&gt;</span> <span class="nt">&lt;goal&gt;</span>run<span class="nt">&lt;/goal&gt;</span> <span class="nt">&lt;/goals&gt;</span> <span class="nt">&lt;/execution&gt;</span> <span class="nt">&lt;/executions&gt;</span> <span class="nt">&lt;/plugin&gt;</span></code></pre> <h1 id="adding-generated-sources-into-the-package">Adding Generated Sources into the Package</h1> <p>The <code>protoc</code> compiler placed the generated Java sources into <code>protobuf.output.directory</code>. We need to add the sources in this directory to the package:</p> <pre><code class="language-xml"><span class="c">&lt;!-- add generated proto buffer classes into the package --&gt;</span> <span class="nt">&lt;plugin&gt;</span> <span class="nt">&lt;groupId&gt;</span>org.codehaus.mojo<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>build-helper-maven-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${build-helper-maven-plugin.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;executions&gt;</span> <span class="nt">&lt;execution&gt;</span> <span class="nt">&lt;id&gt;</span>add-classes<span class="nt">&lt;/id&gt;</span> <span class="nt">&lt;phase&gt;</span>generate-sources<span class="nt">&lt;/phase&gt;</span> <span class="nt">&lt;goals&gt;</span> <span class="nt">&lt;goal&gt;</span>add-source<span class="nt">&lt;/goal&gt;</span> <span class="nt">&lt;/goals&gt;</span> <span class="nt">&lt;configuration&gt;</span> <span class="nt">&lt;sources&gt;</span> <span 
class="nt">&lt;source&gt;</span>${protobuf.output.directory}<span class="nt">&lt;/source&gt;</span> <span class="nt">&lt;/sources&gt;</span> <span class="nt">&lt;/configuration&gt;</span> <span class="nt">&lt;/execution&gt;</span> <span class="nt">&lt;/executions&gt;</span> <span class="nt">&lt;/plugin&gt;</span></code></pre> <h1 id="shading-protocol-buffers-package">Shading Protocol Buffers Package</h1> <p>Say you are done with your project, which includes <code>protobuf-java</code> version 3.0.0-beta-1 as a dependency. What if there is another package that is included as a direct or transitive Maven dependency and injects <code>protobuf-java</code> version 2.5.0? Then you are doomed; you will get a package version conflict. In order to avoid this problem, you can leverage <code>maven-shade-plugin</code> to <em>relocate</em> <code>com.google.protobuf</code> package contents to a private package within your project:</p> <pre><code class="language-xml"><span class="c">&lt;!-- shade protobuf to avoid version conflicts --&gt;</span> <span class="nt">&lt;plugin&gt;</span> <span class="nt">&lt;groupId&gt;</span>org.apache.maven.plugins<span class="nt">&lt;/groupId&gt;</span> <span class="nt">&lt;artifactId&gt;</span>maven-shade-plugin<span class="nt">&lt;/artifactId&gt;</span> <span class="nt">&lt;version&gt;</span>${maven-shade-plugin.version}<span class="nt">&lt;/version&gt;</span> <span class="nt">&lt;executions&gt;</span> <span class="nt">&lt;execution&gt;</span> <span class="nt">&lt;phase&gt;</span>package<span class="nt">&lt;/phase&gt;</span> <span class="nt">&lt;goals&gt;</span> <span class="nt">&lt;goal&gt;</span>shade<span class="nt">&lt;/goal&gt;</span> <span class="nt">&lt;/goals&gt;</span> <span class="nt">&lt;configuration&gt;</span> <span class="nt">&lt;relocations&gt;</span> <span class="nt">&lt;relocation&gt;</span> <span class="nt">&lt;pattern&gt;</span>com.google.protobuf<span class="nt">&lt;/pattern&gt;</span> <span 
class="nt">&lt;shadedPattern&gt;</span>${project.groupId}.${project.artifactId}.shaded.protobuf<span class="nt">&lt;/shadedPattern&gt;</span> <span class="nt">&lt;/relocation&gt;</span> <span class="nt">&lt;/relocations&gt;</span> <span class="nt">&lt;/configuration&gt;</span> <span class="nt">&lt;/execution&gt;</span> <span class="nt">&lt;/executions&gt;</span> <span class="nt">&lt;/plugin&gt;</span></code></pre> <p>This will relocate the contents of <code>com.google.protobuf</code> package to <code>${project.groupId}.${project.artifactId}.shaded.protobuf</code> and make the classes accessible under this namespace. That is, instead of using</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">com.google.protobuf.*</span><span class="o">;</span></code></pre> <p>in your project, you should use the new relocated package name:</p> <pre><code class="language-java"><span class="kn">import</span> <span class="nn">groupId.artifactId.shaded.protobuf.*</span><span class="o">;</span></code></pre> <p>(Note that you need to replace <code>groupId</code> and <code>artifactId</code> literals in the Java code.)</p> <h1 id="conclusion">Conclusion</h1> <p>In this post, I tried to summarize the necessary set of steps to compile Protocol Buffers schema into Java classes using plain Maven magic. This is important to get a platform-independent build for a project.</p> <p>The absence of a proper Maven plugin to handle all these steps for us causes quite a bit of <code>pom.xml</code> pollution. Nevertheless, (de)serializers for the messaging medium are generally distributed in a separate artifact, hence this will probably end up being the entire content of your <code>pom.xml</code>.</p>