xenji.com - My Dev Life: June 2012

Monday, June 25, 2012

Stateful web applications pt.2

This is a follow-up to my first article about stateful web applications, which you can find here; you may want to read it first.

Push state - poll state

We are still talking about stateful web applications. In my previous article I talked about a challenge - response method to keep the client and the server side in sync. This method had some drawbacks but was simple to implement.

The push state - poll state method is more complex to set up, but much simpler when it comes to the development part. Instead of using a single, serial channel of communication, we break things up and use two separate channels for our needs. This simple change lets us accept more change state requests than we could with the sequential version (meaning the challenge - response method).

The setup

You will need something that accepts a high number of concurrent requests. I would not recommend an Apache httpd for that. I've tried node.js and it worked. (nginx might fit as well, depending on your software stack.) This server, let's call it "change-bucket", gets all the change state requests from all clients via PUT requests. It responds with a 2xx status, for example 204 or 201. The response only confirms that the PUT request reached the server, not that the change has been applied. As you might imagine, this means the application state is only eventually consistent.

The change-bucket passes the change request to a shared queue, which could be a redis server or something like Apache ActiveMQ, RabbitMQ, or similar. The queue server maintains one list per application (a single user can have several of them side by side). By the way: redis is awesome for such a job, because it supports list operations natively - even blocking ones - and produces less overhead than a real message queue.

The last thing you need is something like memcache. You can use redis or a real memcache server, depending on your setup. The key is: it must be fast, and it must keep its data at least as long as it has a power supply.
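To make the role of the change-bucket a bit more tangible, here is a minimal sketch in Python. It is not the node.js server I tried; the class name `ChangeBucket`, the methods `put` and `drain`, and the in-memory deques (standing in for one redis list per application) are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class ChangeBucket:
    """In-memory stand-in for the change-bucket plus its shared queue."""

    def __init__(self):
        # One FIFO list per application id, like one redis list per application.
        self.queues = defaultdict(deque)

    def put(self, app_id, change):
        """Accept a change request and acknowledge receipt only."""
        self.queues[app_id].append(change)
        return 204  # means "received", not "applied"

    def drain(self, app_id):
        """Remove and return all pending changes of one application, oldest first."""
        pending = list(self.queues[app_id])
        self.queues[app_id].clear()
        return pending
```

The important detail is that `put` never touches the application state. It only queues the change and answers, which is why the client cannot assume the change is visible yet.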

Now that we have all change requests of the user in a queue, we need to get them out, change the server-side application state and reflect those changes to the client. This is also the point in this setup where you can decide which of two ways to go.

We will now switch perspectives and look at things from the client side. This helps us to identify the point where those two ways come together again.

The client

The client needs a poll mechanism. You know, something like setTimeout() with a proper tail recursion, meaning the callback schedules its own next run as its last step. I would not recommend using call(), as it is of lower priority in most JavaScript engines. The interval depends on your technology stack, but I recommend making it possible to set it to 250 ms or less. This enables you to fake a real-time feeling on the customer's side. This poll request must go somewhere and it needs to receive something. What does it receive? The full application state. Where does it go? Now we have reached the other side of our two-way split.

Path 1 - blocking list operations

This way is of moderate complexity, but its performance does not scale well. You need only one additional piece for your architecture puzzle: an endpoint that the client asks for the latest application state and that fetches the stack of change requests from redis. This fetch operation must be blocking to prevent a concurrency issue. The concurrency issue comes from requesting the same endpoint twice: the first request has not persisted the new application state yet, and the second request loads the old state from the memcache server. The last one to persist the state wins. In this case we would get an unrecoverable, inconsistent state. Another problem: you need to handle unsuccessful PUT requests to the change-bucket, e.g. with a retry loop in the client.
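The "last one wins" problem is easy to reproduce on paper. The following Python sketch is only an illustration under assumptions: the dict `cache` stands in for the memcache server, the box names are made up, and a `threading.Lock` plays the role that the blocking fetch plays in this setup.

```python
import threading

# Hypothetical stand-in for the memcache server that holds the application state.
cache = {"state": {"box_a": "closed", "box_b": "closed"}}
state_lock = threading.Lock()


def load_state():
    return dict(cache["state"])


def save_state(state):
    cache["state"] = state


def racy_requests():
    """Two overlapping requests without any blocking: the second save wins."""
    first = load_state()
    second = load_state()       # still sees the old state
    first["box_a"] = "open"
    save_state(first)
    second["box_b"] = "open"
    save_state(second)          # overwrites the first request's change
    return cache["state"]


def apply_change(key, value):
    """Serialized read-modify-write: only one request at a time passes the lock."""
    with state_lock:
        state = load_state()
        state[key] = value
        save_state(state)
    return cache["state"]
```

After `racy_requests()` the change to `box_a` is gone, although both requests were "successful". With `apply_change` every request sees the result of the previous one, at the price of requests waiting for each other.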

Path 2 - everything non-blocking

In this setup you will need another, daemon-like instance to solve the problem of concurrency. We need to decouple the polling process from persisting the new state. We could do this with a cronjob, a node.js server or something completely different. The important aspect is that this job is the only one that takes data from the queue and merges it into the new state. The polling mechanism may see the same state twice, but that is OK as long as we get rid of our concurrency issues.
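A rough Python sketch of what one run of such a merge job could look like. The function name `merge_pending` is hypothetical, the standard library `queue.Queue` stands in for the redis list, and the "merge" is a plain dict update, which is an assumption about how change requests are shaped.

```python
import queue


def merge_pending(changes, state):
    """Drain every queued change and merge it into a copy of the state.

    Meant to be called by a single worker only, so that nobody else
    writes the state while the merge is running.
    """
    merged = dict(state)
    while True:
        try:
            change = changes.get_nowait()   # never blocks
        except queue.Empty:
            return merged
        merged.update(change)               # later changes override earlier ones
```

The poll endpoint then only reads whatever state the worker stored last. It never writes, so two polls arriving at the same time cannot overwrite each other.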

Conclusion

We've built a circle, a round trip for our application state. You might now have the feeling that this is a shitload of overhead for the task of "just" persisting an application state, but that is OK. It is a rather complex setup with its own issues, and I must admit that it works only in theory at the moment.

Future

One of the big advantages of this setup is the possibility to scale it. If the polling mechanism reaches its limit, you can easily replace the polling with persistent socket connections. If you do so, you might not even need a separate change-bucket, because you could use the same socket for both parts of the communication, handled by a single architectural piece.

Monday, June 18, 2012

Stateful web applications pt.1

Preface

In modern web applications we mostly use two-sided setups. One side is a client - which will be a browser in this case - the other is some sort of server with some sort of application. The languages used on the server side do not matter at all, but the client-side language will be JavaScript.

It is all about state

The application has a state. This might mean some sort of dataset which the client works on, or something which is brought to the client by the server and is fully or partly persisted to enable further client requests in the same data context. This topic is closely connected to caching those datasets and working on the cached results rather than on live datasets.

HTTP is stateless, which means that we must rely on other techniques to synchronize the server and the client state. You might call for (web-)sockets now, but not all companies are able to cope with a change of their infrastructure to use sockets for the number of users that are concurrently browsing the app.

Challenge - Response

One way of getting this done is a challenge - response communication protocol between the server and the client. The client always adds a token to identify itself on the server side. In addition, when the client sends a request, it adds a ticket id to it. The server processes the request either in real time or asynchronously. When doing it in real time, the server responds directly to the client request. An async request must trigger a polling mechanism, where the server answers at some later time with the requested ticket id. This type of communication works well as long as your requests do not depend on each other. If you need state changes in a certain order, you need to add some kind of counter to them. The server needs to keep track of the counter and cannot process counter number 5 before counter number 4 was processed successfully.

This might lead to a deadlock situation, where counter 4 is lost in space for some reason and the server is stuck holding back anything above counter 4.
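The counter rule and the way it gets stuck can be sketched in a few lines of Python. The class name `OrderedProcessor` and its attributes are made up for this illustration, and "processing" a change just means appending it to a list.

```python
class OrderedProcessor:
    """Applies change requests strictly in counter order."""

    def __init__(self, first_counter=1):
        self.next_counter = first_counter
        self.held_back = {}   # counter -> change that arrived too early
        self.applied = []     # changes in the order they were processed

    def receive(self, counter, change):
        self.held_back[counter] = change
        # Process as long as the next expected counter is available.
        while self.next_counter in self.held_back:
            self.applied.append(self.held_back.pop(self.next_counter))
            self.next_counter += 1
        return list(self.applied)
```

If the requests with counters 1, 2, 3 and 5 arrive, only the first three are applied and number 5 waits in `held_back`. As long as number 4 never shows up, nothing above it is ever processed, which is exactly the stuck situation described above. A real server would need some timeout or resend rule to get out of it.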

Another drawback might be the danger of slipping into some sort of interpretation of the client state. The server gives you a bunch of values to set some of the states that were asked for in the last response. You might be forced to start guessing the state of other components based on the partial response.

Too abstract? OK, here is an example. You open some sort of info box and its content must be loaded from a server via an ajax call. The response comes back and you display your text. This box is one of three in a left-column UI element. The UI concept says that no more than one box should be open at the same time. What happens next is common practice at the moment, but from my point of view it is a great danger. The client decides that, on opening another box, the currently open box has to be closed. This state never reaches the server. The client inferred that it must close the box, because another box wants to gain the state "open".

Let's go one step further. We have another, dependent view element that also requires a server request. The answer to that request could possibly open one of the three boxes. As the server's answer forces the client to do that, the current state of the client (which is unknown to the server) gets lost. The client needs to interpret the result again and concludes that the currently open box must be closed to satisfy the requirements of the former server response.

Where is the drawback now, you might ask. Try to let the client send a server request based on data that cannot be interpreted correctly by the client:

You have an entity to show via an id.

You need to show meta data based on the id.

You have a default setting on the root level of your application ("/").

You do not want to show the meta data for the default id, but for another one.

If you just deliver the id, the client is not able to decide whether it is the default or not. The amount of "please client: if default do that, if not do this" grows and will kill you/your app some day.

The next article will be about using a "Push / Poll State Sync" approach.

Saturday, June 16, 2012

Building plugins for PHPStorm sucks - somehow

I started building my own plugins for Symfony2 on PHPStorm a while ago and yes, JetBrains makes awesome IDEs.
If you ever wondered why there are so few plugins - compared to Eclipse - you might want to have a look at the available documentation for developing plugins for IntelliJ IDEA based products. Sorry guys, but this is crap. There is no real API documentation anywhere but in the source code, offered in the form of the IntelliJ Community Edition. Yes, this might lead you to the assumption that most of it is usable as a good starting point. Nope.

You might know the concept of extension points from Eclipse. It is an awesome concept and also has its place in JetBrains' IDEs. The lack of documentation on which extension point is intended for what kind of feature makes it nearly impossible to get started without asking rather "dumb" questions in their forums.

I've tried to create a "Clickable Routes" plugin to support Symfony2 developers. It should enable you to click on a route string within a twig template's path() call. I tried to use the same PsiReferenceContributor stuff which I had successfully used before for my "Clickable Views" plugin. Unfortunately twig is a template language and therefore handled differently in PHPStorm. The tokens of the template are not delivered to the contributor's extension point. I had to find this out myself, as my post in JetBrains' devnet was left unanswered by their developers.

This means I can only provide clickable routes in PHP files, but not in twig files. This sucks.

Thursday, June 7, 2012

fish - my new favorite shell

Thanks to @hochchristoph, I found my new default shell for my Ubuntu systems. It's "fish". You can find it here: http://ridiculousfish.com/shell/.

And for the git users using __git_ps1... Here is my ~/.config/fish/config.fish, based on the info I found here.

Wednesday, June 6, 2012

PHPStorm and Symfony2

I've been working with Symfony 2 a lot these days and I saw the wonderful work, which Robert did with his Eclipse plugins. I've started to write a bunch of plugins for PHPStorm, which - I hope so - will ease the way of developing Symfony 2 apps in PHPStorm.

You can find them here: http://xenji.github.com/phpstorm-symfony2-plugin/

Tuesday, June 5, 2012

What is so bad about ... PHP

or web technology? I hear so many people ranting these days. About PHP, about MongoDB and so many other things. I got bored of it, as nothing that is said about it is new. PHP has been the way it is for over a decade, so why do you fuck with it ... right now? It's the same thing with Java, but Java has proven itself as a so-called enterprise technology, which lifts its status above criticism? PHP has been there, most likely before Ruby got famous and before things like Django reached the status they have now. All of them are good at solving problems we never had before they came up. We built websites without MVC, without Doctrine, Symfony2, Rails or ActiveRecord or document databases. Don't get me wrong, I like the NoSQL movement! But as the name says: "not only" - which for me means "in both directions". The thing is that we've reached a scale which needs fresh ideas. Rails was a good one, Django and Symfony2, too. Especially Symfony2 showed the PHP world how you can use PHP in an enterprise manner. All you people ranting about PHP, look at this framework and tell me if it does not solve the same kind of problem which you solve with Rails or whatever tool you use! What you like or dislike is not important for solving a problem. Use the right tool for your problem. These words are old, maybe older than me. They still count. By the way: PHP moved to GitHub. That is your chance to contribute and change the things you do not like - instead of just complaining about them.