
Dealmaster: Get a Google Daydream View VR headset for $40


Greetings, Arsians! Courtesy of our friends at TechBargains, we have another round of deals to share. Today's list is headlined by a deal on the coral version of Google's Daydream View VR headset, which is down to $40 at Verizon as of this writing.

While this is not the absolute lowest price we've seen for Google's mobile VR headset, it's still more than half off its standard $99 list price. Smartphone VR is still the lightest VR experience, but if you plan on buying a new Pixel 3, want to use it as your own personal movie theater, and don't want to splash the cash on a more advanced standalone headset like the upcoming Oculus Quest, the Daydream View is still a decent entry point.

If you have no interest in virtual reality, we also have deals on AMD processors, sous vide cookers, the Nvidia Shield, storage, and much more. Have a look for yourself below.


Compile-time Dependency Injection With Go Cloud's Wire


Overview

The Go team recently announced the open source project Go Cloud, with portable Cloud APIs and tools for open cloud development. This post goes into more detail about Wire, a dependency injection tool provided with Go Cloud.

What problem does Wire solve?

Dependency injection is a standard technique for producing flexible and loosely coupled code, by explicitly providing components with all of the dependencies they need to work. In Go, this often takes the form of passing dependencies to constructors:

// NewUserStore returns a UserStore that uses cfg and db as dependencies.
func NewUserStore(cfg *Config, db *mysql.DB) (*UserStore, error) {...}

This technique works great at small scale, but larger applications can have a complex graph of dependencies, resulting in a big block of initialization code that's order-dependent but otherwise not very interesting. It's often hard to break up this code cleanly, especially because some dependencies are used multiple times. Replacing one implementation of a service with another can be painful because it involves modifying the dependency graph by adding a whole new set of dependencies (and their dependencies...), and removing unused old ones. In practice, making changes to initialization code in applications with large dependency graphs is tedious and slow.

Dependency injection tools like Wire aim to simplify the management of initialization code. You describe your services and their dependencies, either as code or as configuration, then Wire processes the resulting graph to figure out ordering and how to pass each service what it needs. Make changes to an application's dependencies by changing a function signature or adding or removing an initializer, and then let Wire do the tedious work of generating initialization code for the entire dependency graph.

Why is this part of Go Cloud?

Go Cloud's goal is to make it easier to write portable Cloud applications by providing idiomatic Go APIs for useful Cloud services. For example, blob.Bucket provides a storage API with implementations for Amazon's S3 and Google Cloud Storage (GCS); applications written using blob.Bucket can swap implementations without changing their application logic. However, the initialization code is inherently provider-specific, and each provider has a different set of dependencies.

For example, constructing a GCS blob.Bucket requires a gcp.HTTPClient, which eventually requires google.Credentials, while constructing one for S3 requires an aws.Config, which eventually requires AWS credentials. Thus, updating an application to use a different blob.Bucket implementation involves exactly the kind of tedious update to the dependency graph that we described above. The driving use case for Wire is to make it easy to swap implementations of Go Cloud portable APIs, but it's also a general-purpose tool for dependency injection.
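
As a rough sketch (the function names here are hypothetical, but the types are the ones mentioned above), the two constructor chains have roughly this shape:

// Hypothetical GCS chain: the bucket needs an HTTP client, which needs GCP credentials.
func NewGCSBucket(ctx context.Context, client *gcp.HTTPClient) (*blob.Bucket, error) {...}
func NewGCPHTTPClient(creds *google.Credentials) (*gcp.HTTPClient, error) {...}

// Hypothetical S3 chain: the bucket needs an AWS config, which needs AWS credentials.
func NewS3Bucket(ctx context.Context, cfg *aws.Config) (*blob.Bucket, error) {...}
func NewAWSConfig(creds *credentials.Credentials) (*aws.Config, error) {...}

Swapping from GCS to S3 means replacing the first chain (and everything only it depends on) with the second, which is exactly the graph surgery Wire automates.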

Hasn't this been done already?

There are a number of dependency injection frameworks out there. For Go, Uber's dig and Facebook's inject both use reflection to do runtime dependency injection. Wire was primarily inspired by Java's Dagger 2, and uses code generation rather than reflection or service locators.

We think this approach has several advantages:

  • Runtime dependency injection can be hard to follow and debug when the dependency graph gets complex. Using code generation means that the initialization code that's executed at runtime is regular, idiomatic Go code that's easy to understand and debug. Nothing is obfuscated by an intervening framework doing "magic". In particular, problems like forgetting a dependency become compile-time errors, not run-time errors.
  • Unlike service locators, there's no need to make up arbitrary names or keys to register services. Wire uses Go types to connect components with their dependencies.
  • It's easier to avoid dependency bloat. Wire's generated code will only import the dependencies you need, so your binary won't have unused imports. Runtime dependency injectors can't identify unused dependencies until runtime.
  • Wire's dependency graph is knowable statically, which provides opportunities for tooling and visualization.

How does it work?

Wire has two basic concepts: providers and injectors.

Providers are ordinary Go functions that "provide" values given their dependencies, which are described simply as parameters to the function. Here's some sample code that defines three providers:

// NewUserStore is the same function we saw above; it is a provider for UserStore,
// with dependencies on *Config and *mysql.DB.
func NewUserStore(cfg *Config, db *mysql.DB) (*UserStore, error) {...}

// NewDefaultConfig is a provider for *Config, with no dependencies.
func NewDefaultConfig() *Config {...}

// NewDB is a provider for *mysql.DB based on some connection info.
func NewDB(info *ConnectionInfo) (*mysql.DB, error) {...}

Providers that are commonly used together can be grouped into provider sets. For example, it's common to use a default *Config when creating a *UserStore, so we can group NewUserStore and NewDefaultConfig into a provider set with wire.NewSet:

var UserStoreSet = wire.NewSet(NewUserStore, NewDefaultConfig)

Injectors are generated functions that call providers in dependency order. You write the injector's signature, including any needed inputs as arguments, and insert a call to wire.Build with the list of providers or provider sets that are needed to construct the end result:

func initUserStore() (*UserStore, error) {
    // We're going to get an error, because NewDB requires a *ConnectionInfo
    // and we didn't provide one.
    wire.Build(UserStoreSet, NewDB)
    return nil, nil  // These return values are ignored.
}

Now we run go generate to execute wire:

$ go generate
wire.go:2:10: inject initUserStore: no provider found for ConnectionInfo (required by provider of *mysql.DB)
wire: generate failed

Oops! We didn't include a ConnectionInfo or tell Wire how to build one. Wire helpfully tells us the line number and types involved. We can either add a provider for it to wire.Build, or add it as an argument:

func initUserStore(info ConnectionInfo) (*UserStore, error) {
    wire.Build(UserStoreSet, NewDB)
    return nil, nil  // These return values are ignored.
}

Now go generate will create a new file with the generated code:

// File: wire_gen.go
// Code generated by Wire. DO NOT EDIT.
//go:generate wire
//+build !wireinject

func initUserStore(info ConnectionInfo) (*UserStore, error) {
    defaultConfig := NewDefaultConfig()
    db, err := NewDB(info)
    if err != nil {
        return nil, err
    }
    userStore, err := NewUserStore(defaultConfig, db)
    if err != nil {
        return nil, err
    }
    return userStore, nil
}

Any non-injector declarations are copied into the generated file. There is no dependency on Wire at runtime: all of the written code is just normal Go code.

As you can see, the output is very close to what a developer would write themselves. This was a trivial example with just three components, so writing the initializer by hand wouldn't be too painful, but Wire saves a lot of manual toil for components and applications with more complex dependency graphs.
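
From there, application code calls the generated injector like any ordinary function. A minimal sketch, assuming the types from the example above and the standard log package:

func main() {
    var info ConnectionInfo // fill in real connection details here
    userStore, err := initUserStore(info)
    if err != nil {
        log.Fatal(err)
    }
    _ = userStore // use the store as usual
}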

How can I get involved and learn more?

The Wire README goes into more detail about how to use Wire and its more advanced features. There's also a tutorial that walks through using Wire in a simple application.

We appreciate any input you have about your experience with Wire! Go Cloud's development is conducted on GitHub, so you can file an issue to tell us what could be better. For updates and discussion about the project, join the project’s mailing list.

Thank you for taking the time to learn about Go Cloud's Wire. We’re excited to work with you to make Go the language of choice for developers building portable cloud applications.


Calls between JavaScript and WebAssembly are finally fast


At Mozilla, we want WebAssembly to be as fast as it can be.

This started with its design, which gives it great throughput. Then we improved load times with a streaming baseline compiler. With this, we compile code faster than it comes over the network.

So what’s next?

One of our big priorities is making it easy to combine JS and WebAssembly. But function calls between the two languages haven’t always been fast. In fact, they’ve had a reputation for being slow, as I talked about in my first series on WebAssembly.

That’s changing, as you can see in the chart below.

This means that in the latest version of Firefox Beta, calls between JS and WebAssembly are faster than non-inlined JS to JS function calls. Hooray! 🎉

Performance chart showing time for 100 million calls. wasm-to-js before: about 750ms. wasm-to-js after: about 450ms. JS-to-wasm before: about 5500ms. JS-to-wasm after: about 450ms. monomorphic JS-to-wasm before: about 5250ms. monomorphic JS-to-wasm after: about 250ms. wasm-to-builtin before: about 6000ms. wasm-to-builtin after: about 650ms.

So these calls are fast in Firefox now. But, as always, I don’t just want to tell you that these calls are fast. I want to explain how we made them fast. So let’s look at how we improved each of the different kinds of calls in Firefox (and by how much).

But first, let’s look at how engines do these calls in the first place. (And if you already know how the engine handles function calls, you can skip to the optimizations.)

How do function calls work?

Functions are a big part of JavaScript code. A function can do lots of things, such as:

  • assign variables which are scoped to the function (called local variables)
  • use functions that are built-in to the browser, like Math.random
  • call other functions you’ve defined in your code
  • return a value

A function with 4 lines of code: assigning a local variable with let w = 8; calling a built-in function with Math.random(); calling a user-defined function named randGrid(); and returning a value.

But how does this actually work? How does writing this function make the machine do what you actually want? 

As I explained in my first WebAssembly article series, the languages that programmers use — like JavaScript — are very different than the language the computer understands. To run the code, the JavaScript we download in the .js file needs to be translated to the machine language that the machine understands. 


Each browser has a built-in translator. This translator is sometimes called the JavaScript engine or JS runtime. However, these engines now handle WebAssembly too, so that terminology can be confusing. In this article, I’ll just call it the engine.

Each browser has its own engine:

  • Chrome has V8
  • Safari has JavaScriptCore (JSC)
  • Edge has Chakra
  • and in Firefox, we have SpiderMonkey

Even though each engine is different, many of the general ideas apply to all of them. 

When the browser comes across some JavaScript code, it will fire up the engine to run that code. The engine needs to work its way through the code, going to all of the functions that need to be called until it gets to the end.

I think of this like a character going on a quest in a videogame.

Let’s say we want to play Conway’s Game of Life. The engine’s quest is to render the Game of Life board for us. But it turns out that it’s not so simple…

Engine asking Sir Conway function to explain life. Sir Conway sends the engine to the Universum Neu function to get a Universe.

So the engine goes over to the next function. But the next function will send the engine on more quests by calling more functions.

Engine going to Universum Neu to ask for a universe. Universum Neu sends the engine to Randgrid.

The engine keeps having to go on these nested quests until it gets to a function that just gives it a result. 

Randgrid giving the engine a grid.

Then it can come back to each of the functions that it spoke to, in reverse order.

The engine returning through all of the functions.

If the engine is going to do this correctly — if it’s going to give the right parameters to the right function and be able to make its way all the way back to the starting function — it needs to keep track of some information. 

It does this using something called a stack frame (or a call frame). It’s basically like a sheet of paper that has the arguments to go into the function, says where the return value should go, and also keeps track of any of the local variables that the function creates. 

A stack frame, which is basically a form with lines for arguments, locals, a return value, and more.

The way it keeps track of all of these slips of paper is by putting them in a stack. The slip of paper for the function that it is currently working with is on top. When it finishes that quest, it throws out the slip of paper. Because it’s a stack, there’s a slip of paper underneath (which has now been revealed by throwing away the old one). That’s where we need to return to. 

This stack of frames is called the call stack.

a stack of stack frames, which is basically a pile of papers

The engine builds up this call stack as it goes. As functions are called, frames are added to the stack. As functions return, frames are popped off of the stack. This keeps happening until we get all the way back down and have popped everything out of the stack.
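
If you like seeing that bookkeeping written out, here's a toy sketch in Go (nothing like how a real engine lays out memory, just the push-and-pop idea):

// Frame is a simplified stack frame: which function is running, its arguments,
// and its local variables.
type Frame struct {
    Fn     string
    Args   []int
    Locals map[string]int
}

// callStack is the pile of slips of paper; the last element is the function
// that's currently running.
var callStack []Frame

// pushFrame is what "calling a function" does to the stack.
func pushFrame(fn string, args ...int) {
    callStack = append(callStack, Frame{Fn: fn, Args: args, Locals: map[string]int{}})
}

// popFrame is what "returning" does: throw the top slip away and hand the
// result back to the caller's frame underneath.
func popFrame(result int) int {
    callStack = callStack[:len(callStack)-1]
    return result
}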

So that’s the basics of how function calls work. Now, let’s look at what made function calls between JavaScript and WebAssembly slow, and talk about how we’ve made this faster in Firefox.

How we made WebAssembly function calls fast

With recent work in Firefox Nightly, we’ve optimized calls in both directions — both JavaScript to WebAssembly and WebAssembly to JavaScript. We’ve also made calls from WebAssembly to built-ins faster.

All of the optimizations that we’ve done are about making the engine’s work easier. The improvements fall into two groups:

  • Reducing bookkeeping, which means getting rid of unnecessary work to organize stack frames
  • Cutting out intermediaries, which means taking the most direct path between functions

Let’s look at where each of these came into play.

Optimizing WebAssembly » JavaScript calls

When the engine is going through your code, it has to deal with functions that are speaking two different kinds of language—even if your code is all written in JavaScript. 

Some of them—the ones that are running in the interpreter—have been turned into something called byte code. This is closer to machine code than JavaScript source code, but it isn’t quite machine code (and the interpreter does the work). This is pretty fast to run, but not as fast as it can possibly be.

Other functions — those which are being called a lot — are turned into machine code directly by the just-in-time compiler (JIT). When this happens, the code doesn’t run through the interpreter anymore.

So we have functions speaking two languages: byte code and machine code.

I think of these different functions which speak these different languages as being on different continents in our videogame. 

A game map with two continents—One with a country called The Interpreter Kingdom, and the other with a country called JITland

The engine needs to be able to go back and forth between these continents. But when it does this jump between the different continents, it needs to have some information, like the place it left from on the other continent (which it will need to go back to). The engine also wants to keep the frames it collects on each continent separate.

To organize its work, the engine gets a folder and puts the information it needs for its trip in one pocket — for example, where it entered the continent from. 

 It will use the other pocket to store the stack frames. That pocket will expand as the engine accrues more and more stack frames on this continent.

A folder with a map on the left side, and the stack of frames on the right.

Sidenote: if you’re looking through the code in SpiderMonkey, these “folders” are called activations.

Each time it switches to a different continent, the engine will start a new folder. The only problem is that to start a folder, it has to go through C++. And going through C++ adds significant cost.

This is the trampolining that I talked about in my first series on WebAssembly. 


Every time you have to use one of these trampolines, you lose time. 

In our continent metaphor, it would be like having to do a mandatory layover on Trampoline Point for every single trip between two continents.

Same map as before, with a new Trampoline country on the same continent as The Interpreter Kingdom. An arrow goes from The Interpreter Kingdom, to Trampoline, to JITland.

So how did this make things slower when working with WebAssembly? 

When we first added WebAssembly support, we had a different type of folder for it. So even though JIT-ed JavaScript code and WebAssembly code were both compiled and speaking machine language, we treated them as if they were speaking different languages. We were treating them as if they were on separate continents.

Same map with Wasmania island next to JITland. There is an arrow going from JITland to Trampoline to Wasmania. On Trampoline, the engine asks a shopkeeper for folders.

This was unnecessarily costly in two ways:

  • it created an unnecessary folder, with the setup and teardown costs that come with it
  • it required trampolining through C++ (to create the folder and do other setup)

We fixed this by generalizing the code to use the same folder for both JIT-ed JavaScript and WebAssembly. It’s kind of like we pushed the two continents together, making it so you don’t need to leave the continent at all.

SpiderMonkey engineer Benjamin Bouvier pushing Wasmania and JITland together

With this, calls from WebAssembly to JS were almost as fast as JS to JS calls.

Same perf graph as above with wasm-to-JS circled.

We still had a little work to do to speed up calls going the other way, though.

Optimizing JavaScript » WebAssembly calls

Even though JIT-ed JavaScript and WebAssembly speak the same language, they still use different customs. They have different ways of doing things.

For example, to handle dynamic types, JavaScript uses something called boxing.

Because JavaScript doesn’t have explicit types, types need to be figured out at runtime. The engine keeps track of the types of values by attaching a tag to the value. 

It’s as if the JS engine put a box around this value. The box contains that tag indicating what type this value is. For example, the zero at the end would mean integer.

Two binary numbers with a box around them, with a 0 label on the box.

In order to compute the sum of these two integers (call them a and b), the system needs to remove those boxes. It removes the box for a and then removes the box for b.

Two lines, the first with boxed numbers from the last image. The second with unboxed numbers.

Then it adds the unboxed values together.

Three lines, with the third line being the two numbers added together

Then it needs to add that box back around the results so that the system knows the result’s type.

Four lines, with the fourth line being the numbers added together with a box around it.

This turns what you expect to be 1 operation into 4 operations… so in cases where you don’t need to box (like statically typed languages) you don’t want to add this overhead.

Sidenote: JavaScript JITs can avoid these extra boxing/unboxing operations in many cases, but in the general case, like function calls, JS needs to fall back to boxing.
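
As a loose analogy in Go (this is not what a JS engine does internally, just an illustration of the cost), storing numbers in interface{} values attaches type information to each value, so a single addition turns into unbox, unbox, add, re-box:

// addBoxed adds two "boxed" integers. The type assertions are the unboxing
// steps, and assigning the sum back into an interface value is the re-boxing.
func addBoxed(a, b interface{}) interface{} {
    x := a.(int)                // unbox a
    y := b.(int)                // unbox b
    sum := x + y                // the one operation we actually wanted
    var boxed interface{} = sum // re-box the result
    return boxed
}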

This is why WebAssembly expects parameters to be unboxed, and why it doesn’t box its return values. WebAssembly is statically typed, so it doesn’t need to add this overhead. WebAssembly also expects values to be passed in at a certain place — in registers rather than the stack that JavaScript usually uses. 

If the engine takes a parameter that it got from JavaScript, wrapped inside of a box, and gives it to a WebAssembly function, the WebAssembly function wouldn’t know how to use it. 

Engine giving a wasm function boxed values, and the wasm function being confused.

So, before it gives the parameters to the WebAssembly function, the engine needs to unbox the values and put them in registers.

To do this, it would go through C++ again. So even though we didn’t need to trampoline through C++ to set up the activation, we still needed to do it to prepare the values (when going from JS to WebAssembly).

The engine going to Trampoline to get the numbers unboxed before going to Wasmania

Going to this intermediary is a huge cost, especially for something that’s not that complicated. So it would be better if we could cut the middleman out altogether.

That’s what we did. We took the code that C++ was running — the entry stub — and made it directly callable from JIT code. When the engine goes from JavaScript to WebAssembly, the entry stub un-boxes the values and places them in the right place. With this, we got rid of the C++ trampolining.

I think of this as a cheat sheet. The engine uses it so that it doesn’t have to go to the C++. Instead, it can unbox the values when it’s right there, going between the calling JavaScript function and the WebAssembly callee.

The engine looking at a cheat sheet for how to unbox values on its way from JITland to Wasmania.

So that makes calls from JavaScript to WebAssembly fast. 

Perf chart with JS to wasm circled.

But in some cases, we can make it even faster. In fact, we can make these calls even faster than JavaScript » JavaScript calls in many cases.

Even faster JavaScript » WebAssembly: Monomorphic calls

When a JavaScript function calls another function, it doesn’t know what the other function expects. So it defaults to putting things in boxes.

But what about when the JS function knows that it is calling a particular function with the same types of arguments every single time? Then that calling function can know in advance how to package up the arguments in the way that the callee wants them. 

JS function not boxing values

This is an instance of the general JS JIT optimization known as “type specialization”. When a function is specialized, it knows exactly what the function it is calling expects. This means it can prepare the arguments exactly how that other function wants them… which means that the engine doesn’t need that cheat sheet and doesn’t have to spend extra work on unboxing.

This kind of call — where you call the same function every time — is called a monomorphic call. In JavaScript, for a call to be monomorphic, you need to call the function with the exact same types of arguments each time. But because WebAssembly functions have explicit types, calling code doesn’t need to worry about whether the types are exactly the same — they will be coerced on the way in.

If you can write your code so that JavaScript is always passing the same types to the same WebAssembly exported function, then your calls are going to be very fast. In fact, these calls are faster than many JavaScript to JavaScript calls.

Perf chart with monomorphic JS to wasm circled

Future work

There’s only one case where an optimized call from JavaScript » WebAssembly is not faster than JavaScript » JavaScript. That is when JavaScript has in-lined a function.

The basic idea behind in-lining is that when you have a function that calls the same function over and over again, you can take an even bigger shortcut. Instead of having the engine go off to talk to that other function, the compiler can just copy that function into the calling function. This means that the engine doesn’t have to go anywhere — it can just stay in place and keep computing. 

I think of this as the callee function teaching its skills to the calling function.

Wasm function teaching the JS function how to do what it does.

This is an optimization that JavaScript engines make when a function is being run a lot — when it’s “hot” — and when the function it’s calling is relatively small. 

We can definitely add support for in-lining WebAssembly into JavaScript at some point in the future, and this is a reason why it’s nice to have both of these languages working in the same engine. This means that they can use the same JIT backend and the same compiler intermediate representation, so it’s possible for them to interoperate in a way that wouldn’t be possible if they were split across different engines. 

Optimizing WebAssembly » Built-in function calls

There was one more kind of call that was slower than it needed to be: when WebAssembly functions were calling built-ins. 

Built-ins are functions that the browser gives you, like Math.random. It’s easy to forget that these are just functions that are called like any other function.

Sometimes the built-ins are implemented in JavaScript itself, in which case they are called self-hosted. This can make them faster because it means that you don’t have to go through C++: everything is just running in JavaScript. But some functions are just faster when they’re implemented in C++.

Different engines have made different decisions about which built-ins should be written in self-hosted JavaScript and which should be written in C++. And engines often use a mix of both for a single built-in.

In the case where a built-in is written in JavaScript, it will benefit from all of the optimizations that we have talked about above. But when that function is written in C++, we are back to having to trampoline.

Engine going from wasmania to trampoline to built-in

These functions are called a lot, so you do want calls to them to be optimized. To make it faster, we’ve added a fast path specific to built-ins. When you pass a built-in into WebAssembly, the engine sees that what you’ve passed it is one of the built-ins, at which point it knows how to take the fast-path. This means you don’t have to go through that trampoline that you would otherwise.

It’s kind of like we built a bridge over to the built-in continent. You can use that bridge if you’re going from WebAssembly to the built-in. (Sidenote: The JIT already did have optimizations for this case, even though it’s not shown in the drawing.)

A bridge added between wasmania and built-in

With this, calls to these built-ins are much faster than they used to be.

Perf chart with wasm to built-in circled.

Future work

Currently, the built-ins we support this for are mostly limited to the math built-ins. That’s because WebAssembly currently only has support for integers and floats as value types.

That works well for the math functions because they work with numbers, but it doesn’t work out so well for other things like the DOM built-ins. So currently when you want to call one of those functions, you have to go through JavaScript. That’s what wasm-bindgen does for you.

Engine going from wasmania to the JS Data Marshall Islands to built-in

But WebAssembly is getting more flexible types very soon. Experimental support for the current proposal is already landed in Firefox Nightly behind the pref javascript.options.wasm_gc. Once these types are in place, you will be able to call these other built-ins directly from WebAssembly without having to go through JS.

The infrastructure we’ve put in place to optimize the Math built-ins can be extended to work for these other built-ins, too. This will ensure many built-ins are as fast as they can be.

But there are still a couple of built-ins where you will need to go through JavaScript. For example, if those built-ins are called as if they were using new or if they’re using a getter or setter. These remaining built-ins will be addressed with the host-bindings proposal.

Conclusion

So that’s how we’ve made calls between JavaScript and WebAssembly fast in Firefox, and you can expect other browsers to do the same soon.

Performance chart showing time for 100 million calls. wasm-to-js before: about 750ms. wasm-to-js after: about 450ms. JS-to-wasm before: about 5500ms. JS-to-wasm after: about 450ms. monomorphic JS-to-wasm before: about 5250ms. monomorphic JS-to-wasm after: about 250ms. wasm-to-builtin before: about 6000ms. wasm-to-builtin after: about 650ms.

Thank you

Thank you to Benjamin Bouvier, Luke Wagner, and Till Schneidereit for their input and feedback.

Lin is an engineer on the Mozilla Developer Relations team. She tinkers with JavaScript, WebAssembly, Rust, and Servo, and also draws code cartoons.


A Brief History of High Availability


I once went to a website that had “hours of operation,” and was only “open” when its brick and mortar counterpart had its lights on. I felt perplexed and a little frustrated; computers are capable of running all day every day, so why shouldn’t they? I’d been habituated to the internet’s incredible availability guarantees.

However, before the internet, 24/7 availability wasn’t “a thing.” Availability was desirable, but not something to which we felt fundamentally entitled. We used computers only when we needed them; they weren’t waiting idly by on the off-chance a request came by. As the internet grew, those previously uncommon requests at 3am local time became prime business hours partway across the globe, and making sure that a computer could facilitate the request was important.

Many systems, though, relied on only one computer to facilitate these requests––which we all know is a story that doesn’t end well. To keep things up and running, we needed to distribute the load among multiple computers that could fulfill our needs. However, distributed computation, for all its well-known upsides, has sharp edges: in particular, synchronization and tolerating partial failures within a system. Each generation of engineers has iterated on these solutions to fit the needs of their time.

How distribution came to databases is of particular interest because it’s a difficult problem that’s been much slower to develop than other areas of computer science. Certainly, software tracked the results of some distributed computation in a local database, but the state of the database itself was kept on a single machine. Why? Replicating state across machines is hard.

In this post, we want to take a look at how distributed databases have historically handled partial failures within a system and understand––at a high level––what high availability looks like.

Working with What We Have: Active-Passive

In the days of yore, databases ran on single machines. There was only one node and it handled all reads and all writes. There was no such thing as a “partial failure”; the database was either up or down.

Total failure of a single database was a two-fold problem for the internet: first, computers were being accessed around the clock, so downtime was more likely to directly impact users; second, because computers were under constant demand, they were more likely to fail. The obvious solution to this problem is to have more than one computer that can handle the request, and this is where the story of distributed databases truly begins.

Living in a single-node world, the most natural solution was to continue letting a single node serve reads and writes and simply sync its state onto a secondary, passive machine––and thus, Active-Passive replication was born.


Active-Passive improved availability by having an up-to-date backup in cases where the active node failed––you could simply start directing traffic to the passive node, thereby promoting it to being active. Whenever you could, you'd replace the downed server with a new passive machine (and hope the active one didn't fail in the interim).


At first, replication from the active to the passive node was a synchronous procedure, i.e., transformations weren’t committed until the Passive node acknowledged them. However, it was unclear what to do if the passive node went down. It certainly didn’t make sense for the entire system to go down if the backup system wasn’t available––but with synchronous replication, that’s what would happen.


To further improve availability, data could instead be replicated asynchronously. While the architecture looks the same, asynchronous replication could handle either the active or the passive node going down without impacting the database's availability.
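
Here's a deliberately tiny Go sketch of the asynchronous idea (hypothetical; a real system would persist a replication log, handle reconnects, and so on). Note the gap between the local commit and the replication to the passive node; that gap is what the first downside below hinges on.

type PassiveNode struct{ data map[string]string }

func (p *PassiveNode) Apply(key, value string) { p.data[key] = value }

type ActiveNode struct {
    data        map[string]string
    replication chan [2]string // buffered queue of writes not yet applied to the passive node
}

// Write commits locally and acknowledges the client immediately; replication
// to the passive node happens later, in the background.
func (n *ActiveNode) Write(key, value string) {
    n.data[key] = value
    n.replication <- [2]string{key, value}
}

// replicate runs as a background goroutine, draining the queue onto the passive node.
func (n *ActiveNode) replicate(passive *PassiveNode) {
    for kv := range n.replication {
        passive.Apply(kv[0], kv[1])
    }
}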

While asynchronous Active-Passive was another step forward, there were still significant downsides:

  • When the active node died, any data that wasn’t yet replicated to the passive node could be lost––despite the fact that the client was led to believe the data was fully committed.

  • By relying on a single machine to handle traffic, you were still bound to the maximum available resources of a single machine.

Chasing Five 9s: Scale to Many Machines

As the Internet proliferated, businesses' needs grew in scale and complexity. For databases, this meant that they needed the ability to handle more traffic than any single node could manage, and that providing “always on” high availability became a mandate.

Given that swaths of engineers now had experience working on other distributed technologies, it was clear that databases could move beyond single-node Active-Passive setups and distribute a database across many machines.

Sharding

Again, the easiest place to start is adapting what you currently have, so engineers adapted Active-Passive replication into something more scalable by developing sharding.

In this scheme, you split up a cluster's data by some value (such as a number of rows or the unique values in a primary key) and distribute those segments among a number of sites, each of which has an Active-Passive pair. You then add some kind of routing technology in front of the cluster to direct clients to the correct site for their requests.
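
A minimal sketch of that routing layer in Go (hypothetical site names, using the standard hash/fnv package): hash the primary key and map it onto one of the sites.

// site identifies the Active-Passive pair responsible for one shard.
type site string

var sites = []site{"shard-0.example", "shard-1.example", "shard-2.example"}

// routeKey picks the site that owns a given primary key by hashing it.
func routeKey(primaryKey string) site {
    h := fnv.New32a()
    h.Write([]byte(primaryKey))
    return sites[int(h.Sum32())%len(sites)]
}

Note that changing the number of sites remaps almost every key, which is one reason the resharding changes described below were so painful.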


Sharding lets you distribute your workload among many machines, improving throughput, as well as creating even greater resilience by tolerating a greater number of partial failures.

Despite these upsides, sharding a system was complex and posed a substantial operational burden on teams. The deliberate accounting of shards could grow so onerous that the routing ended up creeping into an application’s business logic. And worse, if you needed to modify the way a system was sharded (such as a schema change), it often posed a significant (or even monumental) amount of engineering to achieve.

Single-node Active-Passive systems had also provided transactional support (even if not strong consistency). However, coordinating transactions across shards was so knotty and complex that many sharded systems decided to forgo them completely.

Active-Active

Given that sharded databases were difficult to manage and not fully featured, engineers began developing systems that would at least solve one of the problems. What emerged were systems that still didn’t support transactions, but were dramatically easier to manage. With the increased demand on applications’ uptime, it was a sensible decision to help teams meet their SLAs.

The motivating idea behind these systems was that each site could contain some (or all) of a cluster’s data and serve reads and writes for it. Whenever a node received a write it would propagate the change to all other nodes that would need a copy of it. To handle situations where two nodes received writes for the same key, other nodes’ transformations were fed into a conflict resolution algorithm before committing. Given that each site was “active”, it was dubbed Active-Active.
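
One common, and deliberately coarse, conflict resolution strategy is last-write-wins by timestamp. A hypothetical Go sketch:

type versioned struct {
    value     string
    timestamp time.Time
}

// resolve merges an incoming write from another site, keeping whichever
// version carries the later timestamp. Ties and clock skew between sites
// are exactly where anomalies sneak in.
func resolve(local map[string]versioned, key string, remote versioned) {
    current, ok := local[key]
    if !ok || remote.timestamp.After(current.timestamp) {
        local[key] = remote
    }
}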


Because each server could handle reads and writes for all of its data, sharding was easier to accomplish algorithmically and made deployments easier to manage.

In terms of availability, Active-Active was excellent. If a node failed, clients just needed to be redirected to another node that did contain the data. As long as a single replica of the data was live, you could serve both reads and writes for it.


While this scheme is fantastic for availability, its design is fundamentally at odds with consistency. Because each site can handle writes for a key (and would in a failover scenario), it’s incredibly difficult to keep data totally synchronized as it is being processed. Instead, the approach is generally to mediate conflicts between sites through the conflict resolution algorithm that makes coarse-grained decisions about how to “smooth out” inconsistencies.

Because that resolution is done post hoc, after a client has already received an answer about a procedure––and has theoretically executed other business logic based on the response––it’s easy for active-active replication to generate anomalies in your data.

Given the premium on uptime, though, the cost of potential anomalies was deemed greater than the cost of downtime and Active-Active became the dominant replication type.

Consistency at Scale: Consensus & Multi-Active Availability

While Active-Active seemed like it addressed the major problem facing infrastructure––availability––it had only done so by forgoing transactions, which left systems that needed strong consistency without a compelling choice.

For example, Google used a massive and complex sharded MySQL system for its advertising business, which heavily relied on SQL’s expressiveness to arbitrarily query the database. Because these queries often relied on secondary indexes to improve performance, they had to be kept totally consistent with the data they were derived from.

Eventually, the system grew large enough in size that it began causing problems for sharded MySQL, so their engineers began imagining how they could solve the problem of having both a massively scalable system that could also offer the strong consistency their business required. Active-Active’s lack of transactional support meant it wasn’t an option, so they had to design something new. What they ended up with was a system based around consensus replication, which would guarantee consistency, but would also provide high availability.

Using consensus replication, writes are proposed to a node, and are then replicated to some number of other nodes. Once a majority of the nodes have acknowledged the write, it can be committed.
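
The commit rule itself is simple to state. Here's a hedged Go sketch that ignores leader election, log ordering, and retries, which is where the real complexity lives:

type Write struct{ Key, Value string }

// Replica is any node that can durably append a proposed write.
type Replica interface {
    Append(Write) error
}

// proposeWrite replicates a write and commits it once a majority of the
// cluster (the proposer plus the replicas that acknowledged) has accepted it.
func proposeWrite(replicas []Replica, w Write) error {
    acks := 1 // the proposing node counts itself
    for _, r := range replicas {
        if err := r.Append(w); err != nil {
            continue // tolerate a minority of failures
        }
        acks++
    }
    if acks > (len(replicas)+1)/2 {
        return nil // majority reached: committed
    }
    return errors.New("not committed: no majority of the cluster acknowledged the write")
}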


Consensus & High Availability

The linchpin here is that consensus replication lies in a sweet spot between synchronous and asynchronous replication: you need some arbitrary number of nodes to behave synchronously, but it doesn't matter which nodes those are. This means the cluster can tolerate a minority of nodes going down without impacting the system's availability. (Caveats made for handling the downed machines' traffic, etc.)


The cost of consensus, though, is that it requires nodes to communicate with others to perform writes. While there are steps you can take to reduce the latency incurred between nodes, such as placing them in the same availability zone, this runs into trade-offs with availability. For example, if all of the nodes are in the same datacenter, it's fast for them to communicate with one another, but you cannot survive an entire datacenter going offline. Spreading your nodes out to multiple datacenters can increase the latency required for writes, but can improve your availability by letting an entire datacenter go offline without bringing down your application.

Multi-Active Availability

CockroachDB implements much of the learnings from the Google Spanner paper (though, notably, without requiring atomic clocks), including those features beyond consensus replication that make availability much simpler. To describe how this works and differentiate it from Active-Active, we’ve coined the term Multi-Active Availability.

Active-Active vs. Multi-Active

Active-Active achieves availability by letting any node in your cluster serve reads and writes for its keys, but propagates any changes it accepts to other nodes only after committing writes.

Multi-Active Availability, on the other hand, lets any node serve reads and writes, but ensures that a majority of replicas are kept in sync on writes (docs), and only serves reads from replicas of the latest version (docs).

In terms of availability, Active-Active only requires a single replica to be available to serve both reads and writes, while Multi-Active requires a majority of replicas to be online to achieve consensus (which still allows for partial failures within the system).

Downstream of these databases' availability, though, is a difference of consistency. Active-Active databases work hard to accept writes in most situations, but then don't make guarantees about the ability for a client to then read that data now or in the future. On the other hand, Multi-Active databases accept writes only when they can guarantee that the data can later be read in a way that's consistent.

Yesterday, Today, Tomorrow

Over the last 30 years, database replication and availability have taken major strides and now support globe-spanning deployments that feel like they never go down. The field's first forays laid important groundwork through Active-Passive replication, but eventually we needed better availability and greater scale.

From there, the industry has developed two predominant paradigms of databases: Active-Active for applications whose primary concern is accepting writes quickly, and Multi-Active for those that require consistency.

May we all look forward to the day when we can harness quantum entanglement and move to the next paradigm in managing distributed state.

Illustration by Christina Chung


Do You Really Know CORS?


[unable to retrieve full-text content]


The future of Java and OpenJDK updates without Oracle



Oracle recently announced that it would no longer supply free (as in beer) binary downloads for JDK releases after a six-month period, and neither would Oracle engineers write patches for OpenJDK bugs after that period. This has caused a great deal of concern among some Java users.

From my point of view, this is little more than business as usual. Several years ago, the OpenJDK 6 updates (jdk6u) project was relinquished by Oracle and I assumed leadership, and then the same happened with OpenJDK 7. Subsequently, Andrew Brygin of Azul took over the leadership of OpenJDK 6. The OpenJDK Vulnerability Group, with members from many organizations, collaborates on critical security issues. With the help of the wider OpenJDK community and my team at Red Hat, we have continued to provide updates for critical bugs and security vulnerabilities at regular intervals. I can see no reason why this process should not work in the same way for OpenJDK 8 and the next long-term support release, OpenJDK 11.

I’m happy to assume leadership of the JDK 8 and 11 update projects if I have the support of the community.

At Red Hat, we intend to provide support for OpenJDK 8 to our customers until 2023, and our policy of always “upstream first” implies that OpenJDK 8 will continue to be updated for critical bugs and security fixes until then. Something similar will happen for JDK 11.

In addition to the people and organizations currently helping with OpenJDK updates, I have received offers of help from organizations not currently involved, in particular from Amazon Web Services. This bodes well, but it may take time to get everyone up to speed working as part of the community.

There is also the question of back-porting important features from later OpenJDK releases to, for example, JDK 8. While new features, particularly performance-related ones, are undoubtedly nice to have, our first priority must be to not break anything: we must remember that we are stewards of a very precious piece of software. Only if we are sure that we’re not taking unnecessary risks should we do major back-ports. We also have to consider the maintenance burden. So, each proposal will have to be taken on its individual merits, and I don’t think we can have a one-size-fits-all policy for such things.

One question which frequently arises is that of how people will get free downloads of compiled OpenJDK binaries, as opposed to the source code downloads that are provided by OpenJDK. I believe that the OpenJDK updates project itself should commit to providing binaries when releases are made. (Having said that, if you’re using some kind of Linux distribution, I would strongly recommend that you use the OpenJDK packages that are provided by the system and its package manager: you should get better integration and ease of updating that way. Some people might be worried that their chosen distribution will not build, test, and package OpenJDK correctly, but if you don’t trust your distribution to build packages, you shouldn’t be using it at all.)

So, when we talk about OpenJDK binaries we’re mainly talking about Windows and Macintosh downloads. It will be up to the JDK updates projects to decide how and where to build the binaries. Having said that, my team at Red Hat is happy to commit to providing regularly updated, tested (and, in particular, TCK’d) Windows and Linux downloads, but we probably will need help building and testing on Macintosh. I’m sure we can get this done and we can continue to deserve the trust of Java users.

Keeping Java updated in the absence of support from Oracle engineers will be a challenge to the Java community, but I believe it is one we should enthusiastically embrace. It is a golden opportunity for us, the community, to show what we can do. A truly open and transparent OpenJDK updates project will encourage wider participation and benefit all Java users.
