Thursday, October 23, 2014

streamrail/concurrent-map


On today's post I am hosting our lead developer and researcher, Roi Lipman.
It's been a few days since Roi published StreamRail's concurrent-map package on our GitHub, and already we are receiving great feedback from the community: we have 4 contributors to the repository, it has been featured in various places around the web as a useful Golang resource and we are using it in our highly concurrent production environment.
Concurrent safe maps / Roi Lipman
I’m really excited about concurrency within Go. Coming from a C++ and Python background I love the ease Go brings to the table when it comes to concurrency: firing multiple tasks each within it own Go routines, communicating data / state using channels and more - life couldn’t be simpler. As such it comes a bit as a surprise to me that one of the most common and well used data structures in Go isn’t concurrent safe: The Map (AKA HashTable, HashMap, Associative Array).
Below are a number of quotes from Golang docs:
Data races are among the most common and hardest to debug types of bugs in concurrent systems. A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
Here is an example of a data race that can lead to crashes and memory corruption:
func main() {
c := make(chan bool)
m := make(map[string]string)
go func() {
m["1"] = "a" // First conflicting access.
c <- true
}()
m["2"] = "b" // Second conflicting access.
<-c
for k, v := range m {
fmt.Println(k, v)
}
}
Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.”
Really ?!? Yes and there’s a good reason for that, once again taken from Golang docs

Why are map operations not defined to be atomic?

After long discussion it was decided that the typical use of maps did not require safe access from multiple goroutines, and in those cases where it did, the map was probably part of some larger data structure or computation that was already synchronized. Therefore requiring that all map operations grab a mutex would slow down most programs and add safety to few. This was not an easy decision, however, since it means uncontrolled map access can crash the program.
The language does not preclude atomic map updates. When required, such as when hosting an untrusted program, the implementation could interlock map access.”
So recently i’ve been working on a piece of code which access a map concurrently, as suggested above my map was indeed a part of some bigger data structure which I could have synchronized but in my case this didn’t made sense, eventually i’ve decided to write a “generic” concurrent map which uses string as its key and interface{} as its value, I feel like these are the most common types  for key, value pairs and will cover most of my use cases.
Although using interface{} as a value type adds the overhead of casting values back from the map as we retrieve them, currently we don’t really have a choice, its either this or implementing a concurrent map foreach type we need.
it would have been interesting if Go had a template mechanism which could allow us to define our own templated structs e.g. make(MyConcurrentMap[string]int)
As you can see, I’ve tried to keep its interface as close as possible to Go’s map:
    // Create a new map.
    map := cmap.New()

    // Add item to map, adds "bar" under key "foo"
    map.Add("foo", "bar")

    // Retrieve item from map.
    tmp, ok := map.Get("foo")

    // Checks if item exists
    if ok == true {
        // Map stores items as interface{}, hence we'll have to cast.
        bar := tmp.(string)
    }

    // Removes item under key "foo"
    map.Remove("foo")

You could find my concurrent map implementation on github. Feel free to clone, fork or contribute!
Cheers,

Roi Lipman StreamRail

No comments:

Post a Comment