At Scribd we generate tons of analytics data every day, and it all has to be processed somehow. Currently we use MySQL to store it, but that is not the best option for logging large volumes of data. So we’ve decided to try some more specialized tools that could help us store and process our data. The most interesting one, which could simplify collecting analytics data, was Scribe. As it turned out, the installation process is not as simple as expected, so here is a short step-by-step manual on how to install Scribe on a developer machine.
0. Prerequisites
The first thing you’ll need is the Thrift library, which Scribe uses for all its network communication. To build it you need the Boost C++ library installed:
    sudo port install boost
1. Installing Thrift
Now we are ready to download and build thrift. I prefer to keep all manually built tools in /opt:
    git clone git://git.thrift-rpc.org/thrift.git
    cd thrift
    ./bootstrap.sh
    ./configure --prefix=/opt/thrift
    sudo make
    sudo make install
Please note that the make command needs root privileges (it installs the Ruby bindings into a system folder).
You will also need to install the fb303 library (it is used in all Facebook/Thrift-related tools for status/health monitoring calls):
    cd contrib/fb303
    ./bootstrap.sh
    ./configure --prefix=/opt/fb303 --with-thriftpath=/opt/thrift
    make
    sudo make install
2. Installing Scribe
Download Scribe from SourceForge. Of course, you can also use the current development version from SVN.
    cd scribe
    ./bootstrap.sh
    ./configure --prefix=/opt/scribe --with-thriftpath=/opt/thrift --with-fb303path=/opt/fb303
    make
    sudo make install
3. Configuring Scribe
I’ve created the /opt/scribe/conf directory and copied the examples/example1.conf configuration there:
    ##
    ## Sample Scribe configuration
    ##

    # This file configures Scribe to listen for messages on port 1463 and write
    # them to /tmp/scribetest

    port=1463
    max_msg_per_second=2000000
    check_interval=3

    # DEFAULT
    <store>
    category=default
    type=buffer

    target_write_size=20480
    max_write_interval=1
    buffer_send_rate=2
    retry_interval=30
    retry_interval_range=10

    <primary>
    type=file
    fs_type=std
    file_path=/tmp/scribetest
    base_filename=thisisoverwritten
    max_size=1000000
    add_newlines=1
    </primary>

    <secondary>
    type=file
    fs_type=std
    file_path=/tmp
    base_filename=thisisoverwritten
    max_size=3000000
    </secondary>
    </store>
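To make the nesting in this file easier to see, here is a minimal Ruby sketch that reads such a configuration into nested hashes. It is only an illustration and not Scribe's own parser; the name parse_scribe_conf and the trimmed-down sample text are made up for this example, and the sketch assumes block names are unique at each level.

```ruby
# Illustrative reader for Scribe-style config text (NOT Scribe's own parser).
# key=value lines become string entries; <name>...</name> blocks become
# nested hashes. Assumption: a later block with the same name at the same
# level simply overwrites an earlier one.
def parse_scribe_conf(text)
  root = {}
  stack = [root]   # hashes for the blocks currently open
  names = []       # names of the blocks currently open
  text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?('#')
    if (m = line.match(%r{\A</(\w+)>\z}))
      raise ArgumentError, "unexpected </#{m[1]}>" unless names.last == m[1]
      names.pop
      stack.pop
    elsif (m = line.match(/\A<(\w+)>\z/))
      child = {}
      stack.last[m[1]] = child
      names.push(m[1])
      stack.push(child)
    elsif (m = line.match(/\A([^=\s]+)\s*=\s*(.*)\z/))
      stack.last[m[1]] = m[2]
    else
      raise ArgumentError, "cannot parse line: #{line}"
    end
  end
  raise ArgumentError, "unclosed <#{names.last}>" unless names.empty?
  root
end

# Shortened version of the sample configuration.
SAMPLE_CONF = <<~CONF
  port=1463
  <store>
  category=default
  type=buffer
  <primary>
  type=file
  file_path=/tmp/scribetest
  </primary>
  </store>
CONF

conf = parse_scribe_conf(SAMPLE_CONF)
puts conf['port']
puts conf['store']['primary']['file_path']
```

Reading the sample this way shows that the buffer store wraps two file stores: primary is where messages normally go, and secondary is the fallback defined inside the same store block.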
Now we are ready to start the Scribe server:
    sudo /opt/scribe/bin/scribed -c /opt/scribe/conf/example1.conf
4. Creating a test Ruby client application
First, generate all the Ruby bindings into your app directory:
    mkdir testapp
    cd testapp
    /opt/thrift/bin/thrift -o . -I /opt/fb303/share/ --gen rb /path/to/downloaded/scribe/if/scribe.thrift
    /opt/thrift/bin/thrift -o . -I /opt/fb303/share/ --gen rb /opt/fb303/share/fb303/if/fb303.thrift
    mv gen-rb scribe
Do not forget to replace /path/to/downloaded/scribe with the real path where you extracted the Scribe sources.
And here is the test console Ruby script:
    #!/usr/bin/env ruby

    # Make the generated bindings in ./scribe loadable
    $LOAD_PATH.unshift(File.dirname(__FILE__) + '/scribe')
    require 'scribe'

    begin
      # Connect to the local Scribe server on the port from example1.conf
      socket = Thrift::Socket.new('localhost', 1463)
      transport = Thrift::FramedTransport.new(socket)
      protocol = Thrift::BinaryProtocol.new(transport, false)
      client = Scribe::Client.new(protocol)
      transport.open()
      # Send a single test message in the 'test' category
      log_entry = LogEntry.new(:category => 'test', :message => 'This is a test message')
      client.Log([log_entry])
      transport.close()
    rescue Thrift::Exception => tx
      print 'Thrift::Exception: ', tx.message, "\n"
    end
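The script above ignores the value returned by client.Log. As far as I can tell, scribe.thrift defines a ResultCode enum with OK and TRY_LATER, so a client may want to retry when the server is busy. Below is a minimal sketch of that idea. The numeric values (OK = 0, TRY_LATER = 1), the helper name log_with_retry and the FakeClient stand-in are assumptions made for this example, so check them against your generated bindings before relying on them.

```ruby
# Result codes as ASSUMED from scribe.thrift (OK = 0, TRY_LATER = 1).
# Defined locally so the sketch runs without the generated bindings.
RESULT_OK = 0
RESULT_TRY_LATER = 1

# Hypothetical helper: sends the entries with client.Log and retries while
# the server does not answer OK. Returns true once a batch is accepted,
# false after the last attempt fails.
def log_with_retry(client, entries, attempts: 3, delay: 0.5)
  attempts.times do |i|
    return true if client.Log(entries) == RESULT_OK
    sleep(delay) if i < attempts - 1 # no pause after the final attempt
  end
  false
end

# Stand-in for Scribe::Client so the example runs on its own. It answers
# with the queued result codes, then with OK once the queue is empty.
class FakeClient
  attr_reader :calls

  def initialize(results)
    @results = results
    @calls = 0
  end

  def Log(_entries)
    @calls += 1
    @results.shift || RESULT_OK
  end
end

client = FakeClient.new([RESULT_TRY_LATER, RESULT_OK])
puts log_with_retry(client, ['entry'], delay: 0)
```

With the real bindings you would pass the Scribe::Client instance and an array of LogEntry objects instead of the fake client.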
Woot, it works! Time to start creating your highly productive logging/data collection system.
P.S. On my machine I got the following output from this script:
    ./scribe/fb303_types.rb:9: warning: already initialized constant DEAD
    ./scribe/fb303_types.rb:10: warning: already initialized constant STARTING
    ./scribe/fb303_types.rb:11: warning: already initialized constant ALIVE
    ./scribe/fb303_types.rb:12: warning: already initialized constant STOPPING
    ./scribe/fb303_types.rb:13: warning: already initialized constant STOPPED
    ./scribe/fb303_types.rb:14: warning: already initialized constant WARNING
    ./scribe/fb303_types.rb:15: warning: already initialized constant VALID_VALUES
To fix this issue, open the generated scribe/scribe_types.rb and replace the require 'fb303_types' line with this:
    require File.dirname(__FILE__) + '/fb303_types'
Many thanx for suggesting Scribe!
Btw, if somebody tries to compile Thrift and Scribe on FreeBSD: for me it works just with “gmake”.
By the way, for Perl there is the Log::Dispatch::Scribe module on CPAN. It also includes a scribe_cat.pl script.
Many thanks for the wonderful instructions to install Scribe on OS X.
I am trying to install it on Mac OSX (leopard: 10.5.6)
I religiously followed the instructions and was able to install boost and thrift successfully. When I am trying to run “make” command during scribe installation, I am running into the following issue.
In file included from common.h:49,
from store.cpp:25:
/opt/thrift/include/thrift/server/TNonblockingServer.h:32:19: error: event.h: No such file or directory
/opt/thrift/include/thrift/server/TNonblockingServer.h:78: error: ISO C++ forbids declaration of ‘event_base’ with no type
/opt/thrift/include/thrift/server/TNonblockingServer.h:78: error: expected ‘;’ before ‘*’ token
/opt/thrift/include/thrift/server/TNonblockingServer.h:81: error: field ‘serverEvent_’ has incomplete type
/opt/thrift/include/thrift/server/TNonblockingServer.h:196: error: ISO C++ forbids declaration of ‘event_base’ with no type
/opt/thrift/include/thrift/server/TNonblockingServer.h:196: error: expected ‘;’ before ‘*’ token
/opt/thrift/include/thrift/server/TNonblockingServer.h:200: error: expected `;' before ‘void’
/opt/thrift/include/thrift/server/TNonblockingServer.h:248: error: ‘event_base’ has not been declared
/opt/thrift/include/thrift/server/TNonblockingServer.h: In constructor ‘apache::thrift::server::TNonblockingServer::TNonblockingServer(boost::shared_ptr, int)’:
/opt/thrift/include/thrift/server/TNonblockingServer.h:113: error: class ‘apache::thrift::server::TNonblockingServer’ does not have any field named ‘eventBase_’
/opt/thrift/include/thrift/server/TNonblockingServer.h: In constructor ‘apache::thrift::server::TNonblockingServer::TNonblockingServer(boost::shared_ptr, boost::shared_ptr, int, boost::shared_ptr)’:
/opt/thrift/include/thrift/server/TNonblockingServer.h:126: error: class ‘apache::thrift::server::TNonblockingServer’ does not have any field named ‘eventBase_’
/opt/thrift/include/thrift/server/TNonblockingServer.h: In constructor ‘apache::thrift::server::TNonblockingServer::TNonblockingServer(boost::shared_ptr, boost::shared_ptr, boost::shared_ptr, boost::shared_ptr, boost::shared_ptr, int, boost::shared_ptr)’:
/opt/thrift/include/thrift/server/TNonblockingServer.h:148: error: class ‘apache::thrift::server::TNonblockingServer’ does not have any field named ‘eventBase_’
/opt/thrift/include/thrift/server/TNonblockingServer.h: At global scope:
/opt/thrift/include/thrift/server/TNonblockingServer.h:292: error: field ‘event_’ has incomplete type
/opt/thrift/include/thrift/server/TNonblockingServer.h:334: error: field ‘taskEvent_’ has incomplete type
/opt/thrift/include/thrift/server/TNonblockingServer.h: In member function ‘void apache::thrift::server::TConnection::setRead()’:
/opt/thrift/include/thrift/server/TNonblockingServer.h:354: error: ‘EV_READ’ was not declared in this scope
/opt/thrift/include/thrift/server/TNonblockingServer.h:354: error: ‘EV_PERSIST’ was not declared in this scope
/opt/thrift/include/thrift/server/TNonblockingServer.h: In member function ‘void apache::thrift::server::TConnection::setWrite()’:
/opt/thrift/include/thrift/server/TNonblockingServer.h:359: error: ‘EV_WRITE’ was not declared in this scope
/opt/thrift/include/thrift/server/TNonblockingServer.h:359: error: ‘EV_PERSIST’ was not declared in this scope
make[3]: *** [store.o] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Any pointers would be of great help!!
Thanks,
Tiru
@tiru,
For this error you need to install the libevent dev library; I am not sure how to install it on Mac.
Ok, tried the thing on Snow Leopard and noticed one thing that needs to be changed here: you need libevent to be installed as a prerequisite.
P.S. Make sure you reinstall thrift after libevent installation to make sure it’d build libthriftnb.
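Since the build error above comes from a missing event.h, a quick preflight check before building Thrift can save a rebuild. Here is a small Ruby sketch of such a check. The helper name find_header and the list of candidate directories are assumptions for illustration (/opt/local/include is the usual MacPorts location), so adjust them to your system.

```ruby
# Hypothetical preflight helper: returns the first directory in `dirs`
# that contains the file `header`, or nil if none of them does.
def find_header(header, dirs)
  dirs.find { |dir| File.file?(File.join(dir, header)) }
end

# Candidate include directories are ASSUMPTIONS; extend the list as needed.
CANDIDATE_INCLUDE_DIRS = ['/opt/local/include', '/usr/local/include', '/usr/include']

found = find_header('event.h', CANDIDATE_INCLUDE_DIRS)
if found
  puts "event.h found in #{found}"
else
  puts 'event.h not found: install libevent, then rebuild Thrift'
end
```

If the header turns up in a directory the compiler does not search by default, the configure step may still need to be pointed at it.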