My idea was a network protocol for sqlite database. This would be a client and a server libraries communicating over network(probably sockets). The client library wold have the same interface as current version of sqlite.so (sqlite.dll). The server library would be a wrapper around exsisting sqlite library.
Problems:
- Since the callback and VM API's would be to slow over network, it is neccessary to reimplement them on the client side to provide compatibility to the standard sqlite library.
- Passing pointers over network is not a good idea. (e.g. pointer to database, or to VM). They should be replaced by handles.
- Encoding of strings and integers. Distinguish between empty string and NULL string. BIG_ENDIAN/LITTLE_ENDIAN and so on.
- Security
- Performance
Implementation architecture: RPC or Sockets?
I tried out the rpcgen (generator of RPC). It generates a lot of useful code, but it can't handle NULL strings returned by sqlite.
Perhaps using sockets is a better idea.
Any suggestions and ideas are welcome. === An alternative RPC mechanism for client server is to use XML, for example XML-RPC. An XML document maps to an SQL dataset nicely, is human readable and because of the tags is very robust. It looks cumbersome, but actually works better than it looks.
Alex K.
Reply to the above: model strings as a discriminated union like so:
union nullable_string switch (bool isnull) { case TRUE: void; case FALSE: opaque string_value<>; };
Nico W.
I have some input on a network protocol for sqlite too (and may actually implement it).
I think that the best transport mechanism is ASCII text (i.e. human readable) over TCP. RPC interfaces are harder to use, limit portability and are much harder to debug.
Since all fields in sqlite are already strings, there is no additional overhead in converting data to ASCII, and there is no need to add transparency (i.e. for '\0') because the database designer had to do that anyway when the data was inserted.
I envisage the protocol that looks a little like the sqlite command line. An example session might look like this (lines prefixed by '>' are sent by the client:
>.cursor select * from foo
ok
>.desc
ok\0
bar\0integer\0
baz\0text\0
>.next 4
ok\0
3\0Hello\0
1\0there\0
3\0how\0
8\0are\0
>.next 4
err 1001 truncated result\0
8\0you?\0
The basic idea is that the first element of any reply indicates success or failure. The '.cursor' statement corresponds to the sqlite_compile() function. The '.next' statemet corresponds to a sqlite_step() call, with the optimization that multiple rows can be requested at once.
This type of interface would give a quasi client-server capabilty. The server is the only process that opens the database file. Clients connect over the network (even local clients). Each connected client would spawn a new server thread and sqlite's multi-threading features would take care of the concurrency problem.
With respect to the issues that you raise, here are my thoughts:
- The network will definitely reduce performance, however since the query is executed on the server, and in many cases the query execution time dominates, for a well-designed client-server application there is no reason that sqlite would be any worse that any other DB (for a single client).
- There should not be any pointers passed. I think that the protocol definition should loosely map to the C API, but only loosely.
- There is no need to distinguish between strings an integers since sqlite itself doesn't. Because all data is in strings there are no endianness issues. Representing NULL however is a problem. Perhaps '\1' can be used to represent NULL, though this would force the protocol to add transparency for embedded '\1' characters.
- If the protocol goes over TCP then security can be achieved by using TLS/SSL.
- See 1.
4/29/04 CJW.
<hr color=red> How about using xmlrpc protocol http://www.xmlrpc.com/
The human readable interface would have its advantage if you use it like sqlite console app. But this will be another API more... If some one just want to use sqlite console on a remote machine, it can be done over ssh.
The advantage of C API would be:
- No changes to existing applications are needed (just replace sqlite.so)
- Since sqlite console app is linked against sqlite.so, we automatically get network capable sqlite console app. (which itself provides some kind of human readable interface) But as i mentioned above, running sqlite console app on a remote machine doesn't require any new modifications - it can be done with ssh.
Performance issue: The performance in LANs would not suffer if you retrive results row by row - that's true. But over internet it's a little bit different. A simple ping takes about 70-100 ms(in a LAN 0.2-0.6 ms). If you look at speed comparisons between Postgres, MySQL and SQLite you can see, that response times for a query lies between 0.3 ms (with index) and 30 ms (without index). In my opinion depending on the situation both the transfer or the execution time may dominate.
NULL issue: sqlite provides an option to set custom representation for NULL strings. Perhaps you can use it some how.
Thanks for link to xmlrpc. Unfortunatelly the xml format produces to mush overhead. (e.g each array element must be surrounded by <value><string>..</string></value>)
How do you plan to use your protocol?
30.04.2004 / Alex K.
We can pass compressed data 01.05.2004
Please, don't use RPC or any protocol which uses unpredictble TCP ports. It will make firewalling them impossible. So please stay with static TCP port assigment.
30.04.2004 / Dan
Response to the above: ONC RPC does not imply using the ONC RPC portmapper (rpcbind). You can use ONC RPC with well-known ports. NFSv4 does, for example (port 2049).
Nico W.
Please consider Tabular Data Stream (TDS) protocol. This is the protocol used by Sybase and MS SQL Server. If you make the wire protocol TDS, then you can use many of the client tools available for SQL Server, etc., including FreeTDS and JDBC.
From the FreeTDS web site:
-
Technically speaking, FreeTDS is an open source implementation of the TDS (Tabular Data Stream) protocol used by these databases for their own clients. It supports many different flavors of the protocol and supports several APIs to access it. Additionally FreeTDS works with other software such as Perl and PHP, providing access from those languages as well. There is also a native (type 4) JDBC driver available for platform-independent Java clients (including Java Server Pages applications), with support for most of the JDBC 1 API and portions of the JDBC 2 API.
Tabular Data Stream (TDS) is defined here
See FreeTDS Project
The protocol is described on this page
OK. I have looked at TDS. Seems to be a rather complex thing. I also examined the SQLite code. In SQLite the sqlite_exec method uses the VM API. The sqlite_get_table uses sqlite_exec and several xxx_printf routines uses either sqlite_exec or sqlite_get_table. So making VM API network capable, we can than take code from sqlite itself to make other routines network capable.
The VM API could be extended to return multiple rows in one turn, instead of calling the _step method once for each row. This would improve performance for long queries.
Other restriction are custom functions. It would be rather slow to establish a connection every time, function is called by sqlite. This functions have to be installed on the server.
The authorization can be done through the sqlite_open method. Instead of file name we pass kind of URL e.g. "user:passwort@192.168.0.1:5000/mydatabase".
The specified database must contain table with users and their passwords. (Suggested table name : sqlite_users). Of course the server have to restrict every access to this table.
03.05.2004 / Alex K.
I have made a prototype for network version. There are binaries for windows and sources. Contains Server, Client library and Browser for testing. Needs a lot of testing and improving. Is some one interested in placing it on Source Forge?
http://www.it77.de/sqlite/sqlite.htm
28.05.2004 / Alex K.
06.12.2004
Check out SQLRelay:
http://sqlrelay.sourceforge.net/
You can build it with SQLite and it has its own API.
For security consider running the protocol over SSHv2 as a subsystem, or else consider using the GSS-API or SASL (if you use ONC RPC then use RPCSEC_GSS).
Nico W.