Recently I built a web application that makes GNATS reporting easy for my team. I use scrapy to crawl the GNATS web pages for people's issues every 4 hours, then store the crawled data in mongodb. A set of simple-to-use RESTful APIs written with nodejs provides easy access to the data (try it out, but it is only viewable internally at Juniper). A django application then consumes the APIs and wraps them in a not-so-bad user interface, thanks to twitter bootstrap and a set of javascript frameworks and libraries. You can look at the final application here: GNATS report system.
What I want to emphasize is that I did all of this in less than a full week and ended up with a working version. I haven't counted the code; it is maybe a few thousand lines of python, coffeescript, css and html.
Here comes the problem - I can never be this productive when writing code for the datapath of JUNOS, or for any embedded system I've worked on. Why is that?
So I began to compare my web application with a data path application, say NAT for embedded systems, to find the magic.
Yes, you can argue that embedded systems are much more complicated, but making a not-so-complicated thing complex is much easier than making it simple, right? Think about a whole web system: you need to deal with a web server, db server, message queue server, cache server, etc. If each of them exposed every detail to you and you had to learn all of them, who could write web applications so easily?
You may also argue that performance is everything, so we have to sacrifice almost everything else. But is that really the right way of doing it, considering that hardware is a lot better than it was 15 years ago? 15 years ago, web applications were also written in C or something equivalent in terms of performance, I guess. But what about now? Our mindset should change with the era. Furthermore, is it easier to get the architecture right first and optimize it afterwards? Or is it easier to optimize the application first, even if that leaves a lot of things in a mess, and then evolve the whole mess? The answer is obvious.
So here's my point - let's try to get the architecture right, with a framework that keeps usability for developers in mind.
Here I'll try to come up with a crazy data plane framework. I don't even know if it works, but it is a good stress test for your brain when you're on a 12-hour flight with nothing else to do.
An engine is a very lightweight component, like a thread, but with a much smaller memory footprint and no data copy between engines. To boost performance, multiple instances of the same engine can run simultaneously.
The engine is the minimum unit in the forwarding path and follows the open-closed principle. An engine should do one thing and only one thing, and it usually should not need to change when a new feature is introduced. For example, you should not create an L3 forward engine that combines lots of things. Instead, you should write something like a TTL engine, which just decreases the TTL of the given packet.
An engine has an inqueue and an outqueue, which hold the packets to be processed and the packets to be sent to the next engine. The writer of the engine usually isn't aware of them. To move a packet forward to the next engine, the current engine just needs to call an API like this:
return next_engine();
This API automatically works out which engine to call next and dispatches the packet to one of the not-so-busy instances of that engine.
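To make the "not-so-busy instance" idea concrete, here is a minimal Python sketch. It assumes that "busy" simply means the length of the input queue; `EngineInstance`, `load()` and `dispatch()` are hypothetical names used only for illustration, not part of any existing framework.

```python
from collections import deque


class EngineInstance:
    """One running copy of an engine, with its own input and output queues."""

    def __init__(self, name):
        self.name = name
        self.inqueue = deque()   # packets waiting to be processed
        self.outqueue = deque()  # packets waiting to be handed on

    def load(self):
        # Assumed load metric: the number of packets waiting in the inqueue.
        return len(self.inqueue)


def dispatch(pkt, instances):
    """Queue pkt on the least-loaded instance and return that instance."""
    target = min(instances, key=EngineInstance.load)
    target.inqueue.append(pkt)  # only a reference is queued, no data copy
    return target


ttl_instances = [EngineInstance("ttl-0"), EngineInstance("ttl-1")]
ttl_instances[0].inqueue.extend([b"pkt-a", b"pkt-b"])  # ttl-0 is busier
chosen = dispatch(b"pkt-c", ttl_instances)
print(chosen.name)  # ttl-1
```

A real dispatcher would probably also want to keep packets of one session on the same instance to preserve ordering; the sketch ignores that.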
To write an engine, you basically need to:
- inherit Command and implement methods like set, get and debug. Configuration is stored in an in-memory key-value database.
- inherit Engine and implement c2s(pkt) and s2c(pkt). They will be called by process(pkt) based on the traffic direction.
A path is the set of engines that a packet will go through. Usually the first packet will trigger a session installation, which creates bidirectional paths for the session.
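The two base classes an engine writer deals with, Command and Engine, might look roughly like this in Python. This is only a sketch under assumptions: a plain dict stands in for the in-memory key-value database, a packet is a dict with hypothetical `direction` and `ttl` fields, and `TTLEngine` inherits from both classes at once.

```python
class Command:
    """Configuration interface; values live in an in-memory key-value store."""

    def __init__(self):
        self._config = {}  # stand-in for the in-memory key-value database

    def set(self, key, value):
        self._config[key] = value

    def get(self, key, default=None):
        return self._config.get(key, default)

    def debug(self):
        return dict(self._config)  # snapshot of the current configuration


class Engine:
    """Base class: process() picks c2s() or s2c() from the traffic direction."""

    def process(self, pkt):
        if pkt["direction"] == "c2s":
            return self.c2s(pkt)
        return self.s2c(pkt)

    def c2s(self, pkt):
        raise NotImplementedError

    def s2c(self, pkt):
        raise NotImplementedError


class TTLEngine(Engine, Command):
    """Does one thing only: decrease the TTL of the packet."""

    def __init__(self):
        super().__init__()        # reaches Command.__init__ through the MRO
        self.set("decrement", 1)  # hypothetical configuration key

    def _decrease(self, pkt):
        pkt["ttl"] -= self.get("decrement")
        return pkt

    # Both directions are handled the same way.
    c2s = _decrease
    s2c = _decrease


engine = TTLEngine()
pkt = engine.process({"direction": "c2s", "ttl": 64})
print(pkt["ttl"])  # 63
```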
The path is organized as a bitmap. The next_engine() API works with the path to decide the next engine, but the engine owner doesn't need to deal with the path.
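One way a bitmap path could drive next_engine() is sketched below. The engine list and its ordering are made up for illustration, and the explicit `path` and `current` arguments are an assumption: in the design above the framework would supply them behind the scenes, so the engine writer still just calls next_engine().

```python
# Hypothetical engine IDs; the bit position doubles as the processing order.
ENGINES = ["session_lookup", "nat", "ttl", "route", "tx"]


def build_path(names):
    """Return a bitmap with one bit set for every engine on the path."""
    bitmap = 0
    for name in names:
        bitmap |= 1 << ENGINES.index(name)
    return bitmap


def next_engine(path, current=None):
    """Return the next engine set in path after current, or None at the end."""
    start = 0 if current is None else ENGINES.index(current) + 1
    remaining = path >> start  # drop the bits up to and including current
    if remaining == 0:
        return None
    lowest = (remaining & -remaining).bit_length() - 1  # lowest set bit
    return ENGINES[start + lowest]


path = build_path(["session_lookup", "ttl", "tx"])
print(bin(path))                            # 0b10101
print(next_engine(path))                    # session_lookup
print(next_engine(path, "session_lookup"))  # ttl
print(next_engine(path, "tx"))              # None
```

With this layout, adding or removing an engine for a session is a single bit flip, and each direction of a session can carry its own bitmap.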
A session is almost the same concept as a typical firewall session. It is a 5-tuple based, bidirectional data structure that provides enough information for engines to process packets.
Sessions are stored in an in-memory key-value database. They can be queried and modified through the database API.
After session lookup, each packet data structure holds a copy of the matched session. Apart from invalidation, engines should normally not modify sessions in the database. Only the session lookup engine may modify a session - e.g. the sequence number, the statistics info, etc.
There are two session classes: SessionStore and Session.
SessionStore has the static match method. If you inherit SessionStore, you should implement get_key() and modify_on_match().
There will be a separate SessionStore for each protocol.
Session has execute(), which calls the engine path attached to the direction the packet comes from.
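Here is a rough Python sketch of how SessionStore and Session could fit together. Several things in it are assumptions: a dict stands in for the key-value database, engines are plain callables, packets are dicts, and `UDPSessionStore` with its direction-independent key is a hypothetical example. It also uses classmethods rather than a true static method, so that match() can reach the hooks of a subclass.

```python
import copy


class Session:
    """Bidirectional session; each direction has its own engine path."""

    def __init__(self, key, paths):
        self.key = key
        self.paths = paths  # {"c2s": [engine, ...], "s2c": [engine, ...]}
        self.packets = 0    # example statistic

    def execute(self, pkt):
        # Run the engine path attached to the direction the packet came from.
        for engine in self.paths[pkt["direction"]]:
            pkt = engine(pkt)
        return pkt


class SessionStore:
    """Base session table; subclasses supply get_key() and modify_on_match()."""

    sessions = {}  # stand-in for the in-memory key-value database

    @classmethod
    def get_key(cls, pkt):
        raise NotImplementedError

    @classmethod
    def modify_on_match(cls, session, pkt):
        raise NotImplementedError

    @classmethod
    def match(cls, pkt):
        session = cls.sessions.get(cls.get_key(pkt))
        if session is None:
            return None
        cls.modify_on_match(session, pkt)    # only the lookup updates the store
        pkt["session"] = copy.copy(session)  # the packet carries its own copy
        return pkt["session"]


class UDPSessionStore(SessionStore):
    sessions = {}  # one table per protocol

    @classmethod
    def get_key(cls, pkt):
        # 5-tuple with the endpoints sorted, so both directions share one key.
        a = (pkt["src"], pkt["sport"])
        b = (pkt["dst"], pkt["dport"])
        return ("udp",) + min(a, b) + max(a, b)

    @classmethod
    def modify_on_match(cls, session, pkt):
        session.packets += 1


def decrease_ttl(pkt):
    pkt["ttl"] -= 1
    return pkt


pkt = {"src": "10.0.0.1", "sport": 1234, "dst": "10.0.0.2", "dport": 53,
       "direction": "c2s", "ttl": 64}
key = UDPSessionStore.get_key(pkt)
UDPSessionStore.sessions[key] = Session(key, {"c2s": [decrease_ttl], "s2c": []})
session = UDPSessionStore.match(pkt)
print(session.execute(pkt)["ttl"], UDPSessionStore.sessions[key].packets)  # 63 1
```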
TunnelSessionStore inherits SessionStore. TunnelSession inherits Session.
TunnelEngine inherits Engine. Its c2s() calls encap(pkt) and its s2c() calls decap(pkt), so you need to implement encap() and decap().
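The TunnelEngine idea can be sketched in Python as follows. The `Engine` class here is a minimal stand-in for the base class described earlier, and `ToyTunnelEngine` with its 4-byte `TUN0` header is entirely made up for illustration; it is not a real tunnel protocol.

```python
class Engine:
    """Minimal stand-in for the Engine base class described earlier."""

    def process(self, pkt):
        return self.c2s(pkt) if pkt["direction"] == "c2s" else self.s2c(pkt)


class TunnelEngine(Engine):
    """c2s() encapsulates and s2c() decapsulates; subclasses supply both."""

    def c2s(self, pkt):
        return self.encap(pkt)

    def s2c(self, pkt):
        return self.decap(pkt)

    def encap(self, pkt):
        raise NotImplementedError

    def decap(self, pkt):
        raise NotImplementedError


class ToyTunnelEngine(TunnelEngine):
    """Hypothetical tunnel that prepends a made-up 4-byte header."""

    HEADER = b"TUN0"

    def encap(self, pkt):
        pkt["payload"] = self.HEADER + pkt["payload"]
        return pkt

    def decap(self, pkt):
        if not pkt["payload"].startswith(self.HEADER):
            raise ValueError("not a tunnelled packet")
        pkt["payload"] = pkt["payload"][len(self.HEADER):]
        return pkt


engine = ToyTunnelEngine()
out = engine.process({"direction": "c2s", "payload": b"hello"})
print(out["payload"])   # b'TUN0hello'
back = engine.process({"direction": "s2c", "payload": out["payload"]})
print(back["payload"])  # b'hello'
```

A new tunnel type then only has to provide encap() and decap(); the direction handling stays in TunnelEngine.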
There is much more to consider, for example TCP proxy, ALG, IDP, AI, QoS, etc. But unfortunately my flight is almost over and I need to free my brain for something more relaxing.
DPDK can zero-copy packets from the driver to user space. This is great news for this idea.
I struggled for a while over which language I should choose. To me, golang is too young, python/ruby is too simple, and c/c++ is just naive for this. Erlang, on the contrary, seems to be a smart choice.
I know little about erlang. You can see that my pseudo code above is all in c/python. I don't even know if erlang supports OOP (I guess so). The reasons I think it is the right choice are:
So the next step is to learn erlang and try to write a framework in which, by adding an engine, I can make basic pass-through TCP traffic work without any issue.
Don't laugh at me if you're an expert. This is not an architecture spec. I just let my thoughts fly and recorded them faithfully.