2011-11-11

Web technology stacks – from LAMP to Janos

The classic stack of small- to medium-scale web technologies is LAMP (Linux, Apache, MySQL, PHP). With the rise of JavaScript and NoSQL databases, another stack is poised to replace it: Janos (client-side JavaScript, Node.js, NoSQL database).

LAMP – the incumbent

Linux and its accompanying software made it cheap for startups to run a web server. The LAMP stack comprises the following components:
  • Linux: a free, Unix-like operating system.
  • Apache: a web server.
  • MySQL: a relational database.
  • PHP: a programming language for web back ends.
LAMP transformed the internet industry by making previously expensive technology available for free.

Janos – the challenger

JavaScript’s client-side popularity triggered many interesting developments. The Janos stack (which is short for JaNoNoS) is one of the results:
  • Client-side JavaScript
  • Node.js
  • A NoSQL database (such as MongoDB or CouchDB)
Comments:
  • The importance of client-side JavaScript. Client-side JavaScript is much more important to the stack than many people realize. It changes the paradigm from client-server to something whose nature is more distributed: On one hand, clients perform more computations and might even communicate with other clients. On the other hand, servers are less responsible for the application logic and mostly become a data tier. An example: FunctionSource assembles its page dynamically in the browser, via JavaScript. As a result, clicking a link usually means that only a part of the page has to be replaced instead of sending the complete page from server to client. There is also a fallback – if a browser does not support JavaScript, the assembling code is executed on the server and the result sent to the client.

    The next step is already in development: With browsers gaining offline functionality such as embedded databases, the data tier is more about syncing databases than about the server managing the data and the client displaying it.

  • Same language on client and server. Not having to switch languages when going from server to client is a big plus. You can reuse much code (validation code, domain logic, etc.) and don’t have to mentally switch between two languages during development. The ability to execute client-side code on the server enables fallbacks if a client does not support JavaScript (see example in the previous item).
  • Data – a confluence of events. It is very fortunate for JavaScript programmers that two things have become popular: JSON as a data transfer format (for web services etc.) and NoSQL databases. Both are perfect fits for JavaScript: JSON uses JavaScript syntax. Schema-less databases make things as flexible on the database side as they are on the programming language side; you get the advantages of object-oriented databases without their messiness.
  • Where is the operating system in the acronym? I initially thought that the stack should include a “U” for a Unix-based operating system. But the truth is that the operating system matters remarkably little, now that Node.js has a proper Windows port.
In production systems, Node.js is often used as a complement to more mature servers. But that is slowly changing. Furthermore, it is already a terrific system for smaller projects.
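
The “same language on client and server” and JSON points above can be sketched in a few lines. This is only a minimal illustration: the function name validateComment, the comment fields and the 500-character limit are assumptions made up for this example, not taken from any real application.

```javascript
// Hypothetical shared module: the same function can run in the browser
// and in Node.js, so the validation rules are written only once.
function validateComment(comment) {
  const errors = [];
  if (typeof comment.author !== 'string' || comment.author.trim() === '') {
    errors.push('author is required');
  }
  if (typeof comment.text !== 'string' || comment.text.length > 500) {
    errors.push('text must be a string of at most 500 characters');
  }
  return errors;
}

// Client side: validate before sending, then serialize to JSON for transfer.
const draft = { author: 'Some Reader', text: 'Thanks for an interesting article!' };
const clientErrors = validateComment(draft);
const payload = JSON.stringify(draft);

// Server side: parse the JSON and re-validate with the very same function,
// because the server cannot trust what a client sends.
const received = JSON.parse(payload);
const serverErrors = validateComment(received);

// A schema-less document store could now save `received` as it is.
console.log(clientErrors.length, serverErrors.length);
```

The object travels from client to server to database without being translated into another data model along the way, which is the main attraction of the combination.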

Another proposed acronym

@evanpro tweets:
PSST! #node.js apps backed by a NoSQL database are now known as the #nono stack. Pass it on!
But while that name sounds nice, it does not mention a key ingredient of the stack: client-side JavaScript.

15 comments:

JK said...

Thanks for an interesting article!

Alex P said...

thanks a lot for the article, 
i would really like to get your opinion on the following topics:
before you answer, could you please also tell us whether you have node production projects and whether or not you worked with evented i/o with other technologies?

what are the advantages of using node.js on server side, besides having JSON (it's quite easy to serialize/deserialize it almost everywhere nowadays) and having one language on front-end and back-end? is it more a technology for scaling / stability, or closer to a replacement for PHP (small/mid-size projects, as you mentioned)? what are the downsides of an event-loop in node.js? how many things do you have to handle as a callback?

thanks!

Axel Rauschmayer said...

So far, I’ve only used Node.js for smaller experiments of mine. I can only compare the experience to programming Java: Love how quickly you get things done, miss the Eclipse IDE.
The comparison with PHP is a good one, it feels similar in many ways (from what I remember of PHP ;-). Node.js is also popular for real-time apps (via Socket.IO).

Event loop approach: you get many multi-tasking advantages, without the risks. Disadvantage: still a bit clumsy, but things will get better:
- http://www.slideshare.net/pgriess/nodejs-concurrency
- http://www.2ality.com/2011/03/make-nodejs-code-pretty-via-generator.html

Alex P said...

Well, socket.io is called a protocol but known to be a set of messy rules. It's not even version 1 yet. So, it may possibly be used for in-browser real-time stuff, but it's quite unlikely to be the solution for a multi-program server-side environment. And better things to use instead of socket.io are (of course) SockJS (OSS, from the creators of RabbitMQ) and Pusher (commercial software).

Event-loop and multi-tasking sounds like a lie. We developed a mini-version of libev internally a while ago, and changed librabbitmq-c to support non-blocking mode, but it still doesn't bring you context switches or real parallelism. The only way to achieve that is a) having a good VM (read: Erlang) or b) having a language that supports native OS threads underneath. Otherwise there's no real multi-tasking. Remember that "concurrency vs parallelism" chapter in Java books, right?

Moreover, if your main purpose is to serve an HTTP page to browser, and you wait for database just for .001 second, there's no win at all to do anything during that time (and what exactly can you do? render half a page?). It can make a software so much more complicated. Libev itself was created around the paradigm of waiting event loop with schedulers, rather than the idea of doing useless non-costly operations while waiting for the database, since you can perform them @ any point, they are not your bottleneck, so it's kind of a lie, too.

It's not that I dislike JavaScript. Again, we have developed something that looks very similar to node.js in-house (except for we use RabbitMQ for messaging instead of libev, and have a very limited OS interaction toolkit). Google v8 is a very powerful technology. It's embeddable and portable, easy to hook and to cook stuff with. But it's not for real-time applications. It's not for hardcore processing or big calculations.

Next thing, having an event loop and callback-driven development. That is an extremely complicated and hard to debug process. If your execution flow waits for something, and continues afterwards, you will simply get tired of mocking. And if you do try to mock when testing, you will end up in a situation where half of your application is rewritten and the mocks still use old signatures. And it's almost impossible to track, other than manually. It's difficult enough in browser. How much more difficult could it be in server-side app, when losing a bit of info costs a bit more than moving pixels here and there?

That's one of the reasons I keep posting rants about Node. It simply has nothing to do with real-time applications and parallelism. These are two common misconceptions. It is a new, unstable piece of software that has yet to develop and prove itself. As always, it's not a panacea. Let's just be a bit more skeptical about the software we use. If we all raise our bars, things will get better in quality, and we won't need to take someone else's advice for granted, without investigation and knowledge of the underlying concepts, which are most of the time even more important than the tech itself.

Cheers.

Axel Rauschmayer said...

- Socket.io is not a protocol, it’s a library (but you seem to be aware of that). So far, I have only heard good things about it. Your experience is different?
- The things you are doing sound much higher-end than what I would use Node.js for. Are you currently using LAMP?
- Node.js is in no way perfect: You do get *some* of the multi-tasking benefits (but not all of them), with few risks. You have to look towards the future for this one: Intel’s RiverTrail and Web Workers. The nested callbacks can be avoided via libraries or (in the future) task.js. Debugging is indeed sometimes tricky, but the quick turn-around helps.

I would never call Node.js a panacea and it makes sense to be skeptical about it. It is work in progress, but it has many interesting ideas and many people are very productive (and happy) with it *now*. Whether you like or not might depend on whether you are already a JavaScript fan. Then you can continue to use your favorite language and get to write shell scripts in it, to boot.

Alex P said...

socket.io is a library, sure. but it uses an underlying (self-invented) protocol for communication between client and server, and there even is a spec for it (https://github.com/LearnBoost/socket.io-spec). so, most likely it's up to its inventors to define whether it's a protocol or not. a protocol is not only tcp or http. We use/develop protocols in-house for inter-application communications, even though they use http and tcp underneath.

WebWorkers is an awesome idea, agreed. 
we don't use LAMP and honestly never did. Right now we use Clojure (and other JVM langs) for close-to-realtime stuff, and AMQP for messaging and Ruby for front-end applications and C with JavaScript (v8) for low-level problems.

Jake Verbaten said...

I'm going to bite here.


>  It's not even version 1 yet
So what? Version numbers matter these days? All software I release never gets up to version "1".
Socket.io is stable and the best cross platform websocket abstraction I've seen. You want to use websockets in production today? Use socket.io
> Otherwise there's no real multi-tasking

That's a lie. Node.js does asynchronous IO. That means all IO is automatically parallelised. If you make 10 HTTP requests asynchronously then all 10 will run in parallel. The fact that javascript is single threaded is being worked on.

Currently we simply run multiple processes (node --cluster will run your process n times with a small load balancer in front of it). There is work being done on exposing threads through v8 isolates for 0.8 (january).

I will agree that it's far from ideal. Erlang is clearly a superior language. But that's a massive learning curve. And it simply doesn't have the community effort node has. Wish it did.

> database just for .001 second, there's no win at all

Of course there is. Do you really think those million nanoseconds couldn't be put to better use? That's a couple of million cpu cycles you're wasting sitting on your ass. If you really want to scale you can't afford to waste cycles like that.

>  It can make a software so much more complicated
Misconception. Node isn't more complex. It's different. And if you attack from a generic synchronous mindset yes it is more complex. I'm afraid you'll have to learn callbacks and how to do it elegantly.

> they are not your bottleneck

Every single blocking operation is a bottleneck. The operation with the maximum latency is your server bottleneck. If this is 1ms then you're stuck at 1000 req/s. If this is 100µs you're stuck at 10k req/s.

>  But it's not for real-time applications. It's not for hardcore processing or big calculations.

Correct, it shouldn't be used for hard real-time applications. But you shouldn't use the web for hard real-time applications.
It shouldn't be used for hardcore processing or big calculations, but hey you have the database for that, and failing that it's easy to plugin C++ into v8 and expose it to your scripting language (js).

However for soft real-time applications, and that's what we mean when we talk about real-time on the web, you need reduced latency. Node & socket.io are very good at reduced latency.

> That is an extremely complicated and hard to debug process.

Only if you don't know what you're doing. However the platform and tools have not hit the maturity of say Java yet. It's more difficult, but you're overstating that difference in difficulty.

>  you will simply get tired of mocking. 

I write unit tests every day for node.js. And I'll tell you now that since javascript is a flexible language they are easier to write than Java or C# or any other unit tests for any other platform I've used.

>  It's difficult enough in browser. How much more difficult could it be in server-side app

What are you even ranting about?

> It simply has nothing to do with real-time applications and parallelism

Hard real-time, no. Soft real-time yes. As for parallelism, well non blocking IO gives you it for free. And then node --cluster gives you a ton more parallelism for free.
Mind you, whether you should be using it in production completely depends on how skilled you are with node & javascript.

Alex P said...

>if you really want to scale you can't afford to waste cycles like that.
Dear Jake, if I really want to scale I would use Clojure, Scala or Erlang. Node.js and scaling come from 2 different worlds. Oh sorry, not would, but I do use. 

> As for parallelism, well non blocking IO gives you it for free.
Say it wth me: concurrency and parallelism are different things. Node CAN NOT run multicore. 

Jake, let's not speak of skills, really. That's an immature discussion to claim on wether i'm skilled enough to use node in production or no. I mean - you don't really know me, and vice versa. Only thing I can tell you - yes, I decided not to use Node in production for Airport access control system install. I decided to stay with libev and not rely on bindings someone came up with, whereas I disagree with most of the code written. 

I use google v8, i use libev, i use messaging, async and realtime things. But node.js has nothing to do with it. Simply nothing. 

If you say that socket.io is great, it's fine. The only guys that did manage to grasp websockets well were Pusherapp. But I still prefer SockJS from the rabbitmq guys to it.

>That means all IO is automatically parallelised. If you make 10 HTTP requests in 
> asynchronously then all 10 will run in parallel. The fact that javascript is single threaded 
> is being worked on.

That's quite a funny sentence. You say it's "parallelised" and right after you say it's "single threaded". That's a contradiction.

Jake, Node is a technology that is built on lies. I like the idea of the project itself, but I can't stand the way people are treating it. I wrote it about 10 times but I write it again here: i have built a system that uses Rabbitmq for async/evented i/o together with libv8 that runs on ARM processors with FS bindings and whatever you may possibly need. That's a lightweight and simplified version of node.js. That was designed especially for one project, and i'm not claiming that it's a world saving software.

Believe me, having 2 processes running on 2 cores will finish executing code faster than context-switching v8 on top of libev, which runs 2 concurrent execution flows. Why? Because POSIX scheduler is freaking faster (first), and v8/libev are still under it. 

Don't take it personally. I like the idea of node, and we did borrow a lot of ideas from it when working on our project. But let's be honest about the software we use. Let's not try to sell it to anyone, and say "it will be better tomorrow, so it's superior today" type of thing.

Axel Rauschmayer said...

Please guys! Keep it civil.

> Jake, let's not speak of skills, really. That's an immature discussion to claim on wether i'm skilled enough to use node in production or no.
The above is a response to “Mind you, whether you should be using it in production completely depends on how skilled you are with node & javascript.” I read this as “depends on how skilled *one* is” not as “you, Alex P., are”.

> Say it wth me: concurrency and parallelism are different things. Node CAN NOT run multicore.
Can you explain? Why can’t the multiple concurrent processes that Node.js (or a tier it contacts) runs not be scheduled on different cores? You can also cluster processes:

http://nodejs.org/docs/latest/api/cluster.html

> You say it's "parallelised" and right after you say it's "single threaded".
The JavaScript code is mostly single-threaded, but it triggers concurrent code.
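
A minimal sketch of what that means, under a few assumptions: fakeRequest is a made-up stand-in for a real HTTP request or database query, the 100 ms delay is arbitrary, and the code uses today's Promise and async/await syntax rather than plain callbacks. Ten waits are started from one thread and overlap, so the total time is close to one wait, not ten.

```javascript
// Simulated I/O: resolves after `ms` milliseconds without blocking the thread.
// (Hypothetical stand-in for an HTTP request or a database query.)
function fakeRequest(id, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(id), ms));
}

async function runConcurrently() {
  const start = Date.now();
  // Start all ten "requests" right away; none of them blocks the others.
  const pending = [];
  for (let i = 0; i < 10; i++) {
    pending.push(fakeRequest(i, 100));
  }
  // The JavaScript callbacks still run one at a time on a single thread,
  // but the waiting happens concurrently.
  const ids = await Promise.all(pending);
  return { ids, elapsedMs: Date.now() - start };
}

runConcurrently().then(({ ids, elapsedMs }) => {
  // Elapsed time is roughly one wait, not the sum of ten waits.
  console.log(ids.length + ' results after about ' + elapsedMs + ' ms');
});
```

Note that this only shows overlapping waits. CPU-heavy JavaScript would still run on one core, which is where multiple processes come in.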

> Jake, Node is a technology that is built on lies.
Flame bait. Please phrase more carefully. And be specific: What are the lies?

> i'm not claiming that it's a world saving software.
Are you getting hung up on the fact that some Node.js people are zealots? I know plenty of Node.js proponents who will tell you openly about its faults and when *not* to use it. But they are also very productive with it!

> Believe me, having 2 processes running on 2 cores will finish executing code faster than context-switching v8 on top of libev, which runs 2 concurrent execution flows.
Do you have numbers? It is certainly possible, but things that sound plausible aren’t always true, so a benchmark would really help here.

Jake Verbaten said...

Yes if you really want to scale use haskell/erlang. Agreed.


> Node CAN NOT run multicore. 
Ah come on, `node --cluster` multicore. done.

> Jake, let's not speak of skills, really.

You misinterpret. We're not talking about whether you're skilled enough to use node. We're talking about whether you have more skills in another area than node. If you have more skills in another area use that area.

I'm personally more skilled in node & js than any other platform so I use node.

> You say it's "parallelised" and right after you say it's "single threaded"

Oh come on. clearly what I mean is that IO is parallelized and application logic written in javascript is single threaded.

> Don't take it personally. I like the idea of node, and we did borrow a lot of ideas from it when working on our project. But let's be honest about the software we use. Let's not try to sell it to anyone, and say "it will be better tomorrow, so it's superior today" type of thing.

That's nice and all. But you haven't mentioned anything about how node is bad apart from erlang and haskell are better (which they are).
Node is still a great technology.
Node is still a great technology.

Alex P said...

my bad. 
removed my rants. i shouldn't have started that discussion anyways, at least in internet. 

i thought that i would much rather write a longish blog-post about unix scheduler, select(2), epoll(2), pthreads, event loops, libev and what's happening while you block if you do. if everyone's aware of these things, at least he knows very well why he's using node, and what are the downsides. although most likely i should not affiliate it with node at all, since it's likely to turn it all to flame.

of course i shouldn't have said that node is built on lies. 

Axel Rauschmayer said...

Sounds like an interesting topic.
“although most likely i should not affiliate it with node at all, since it's likely to turn it all to flame” – not necessarily. There is a reason for most technical decisions. If you acknowledge those reasons then writing about caveats shouldn’t be a problem.

Jay Martin said...

Just found your blog and really enjoying your articles. You have a knack for concisely covering a lot of ground.
I'm new to programming in general, so I'm very fond of the idea of only learning one language, at least to start with. Node and the npm are a bit overwhelming for me at this point. All the glue code, dependencies and build process, as well as administering a separate database was dizzying for a newbie like me. I've been playing around with Wakanda, which has an integrated NoSQL database that rivals MongoDB for speed and CouchDB for reliability, at least on paper. Wakanda can also be used as the data layer for Node. It provides a GUI-based model designer and is a lot of fun so far. I'm still a little dizzy, but things are getting clearer every day. Thanks again for your great JavaScript content!

Axel Rauschmayer said...

“Just found your blog and really enjoying your articles.”
Thanks! Always good to get positive feedback.

“I'm new to programming in general, so I'm very fond of the idea of only learning one language, at least to start with.”
My thoughts on JavaScript being a good choice in this regard are here:
http://www.2ality.com/2012/09/javascript-glass-half-full.html

Tips:
- You can start very simple in Node.js. Use npm only to install modules, require your own modules as files in the same directory via require("./mymodule"). And work with the basic MongoDB driver:
https://npmjs.org/package/mongodb

- If you are just starting with programming, check out the following JavaScript-based course:
http://www.khanacademy.org/cs

- There are a few free books out there, including ones on Node.js:
http://jsbooks.revolunet.com/

Jay Martin said...

Awesome JavaScript resources Alex, thank you!
