State Key Laboratory of Intelligent Technology and Systems
Information Retrieval Group
THUIRDB: High Performance Key-Value DB
THUIRDB is a C++ library for high-performance, single-machine persistent key-value storage and high-speed queries.
For example, take the following corpus file (corpus_file):
Penny <-> liang
tsinghua <-> university
google <-> search engine
On each line, the part before "<->" is the key and the part after it is the value.
Once the storage operation is complete, enter an arbitrary key (e.g. tsinghua) and the system quickly returns the corresponding value (e.g. university).
High index compression rate (on average the index consumes 1 to 2 bits per key-value pair)
Scales well with computing resources
Supports large amounts of data (in theory, with 4G of memory a database of 10 billion records can be created and queried quickly, with at most one disk read per search)
Easy to use (does not rely on other specific libraries)
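The memory figure above can be checked with simple arithmetic. The sketch below is only an estimate under stated assumptions: it counts the index alone (keys and values are assumed to stay on disk), it reads "4G" as 4 GiB, and the function names index_bytes and index_fits_in_memory are hypothetical.

```cpp
#include <cstdint>

// Size in bytes of an index that spends `bits_per_record` bits on each of
// `records` key-value pairs, rounded up to a whole byte.
std::uint64_t index_bytes(std::uint64_t records, std::uint64_t bits_per_record) {
    return (records * bits_per_record + 7) / 8;
}

// True when such an index fits into `memory_bytes` of memory.
// Assumption: only the index is counted, with no other overhead.
bool index_fits_in_memory(std::uint64_t records, std::uint64_t bits_per_record,
                          std::uint64_t memory_bytes) {
    return index_bytes(records, bits_per_record) <= memory_bytes;
}
```

For 10 billion records at 2 bits each, index_bytes returns 2,500,000,000 bytes (about 2.5 GB); at 1 bit each it returns 1,250,000,000 bytes. Both are below 4 GiB (4,294,967,296 bytes), which is consistent with the claim.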
THUIRDB targets read-only scenarios: large-scale data (100 million to 10 billion records) that is created once and not modified afterwards.
Its main advantages are fast database creation and fast queries.
Its main functions:
Batch creation of databases
Queries and concurrent queries
Sequential scanning of the database
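The three functions listed above can be pictured with a small build-once, read-only store. This is a sketch under assumptions, not THUIRDB's interface or its internal data structure: the class name ReadOnlyStore and its members are hypothetical, and a sorted in-memory vector with binary search stands in for the compressed on-disk index.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical read-only store (illustration only, not the THUIRDB API).
class ReadOnlyStore {
public:
    using Record = std::pair<std::string, std::string>;

    // Batch creation: take all records at once and sort them by key.
    // stable_sort keeps duplicate keys in their input order.
    explicit ReadOnlyStore(std::vector<Record> records)
        : records_(std::move(records)) {
        std::stable_sort(records_.begin(), records_.end(),
                         [](const Record& a, const Record& b) {
                             return a.first < b.first;
                         });
    }

    // Query: binary search over the sorted records. The method is const and
    // mutates nothing, so several threads may call it at the same time.
    // Returns nullptr when the key is absent.
    const std::string* find(const std::string& key) const {
        const auto it = std::lower_bound(
            records_.begin(), records_.end(), key,
            [](const Record& r, const std::string& k) { return r.first < k; });
        if (it == records_.end() || it->first != key) return nullptr;
        return &it->second;
    }

    // Sequential scan: visit every record in key order.
    template <typename Visitor>
    void scan(Visitor visit) const {
        for (const Record& r : records_) visit(r.first, r.second);
    }

    std::size_t size() const { return records_.size(); }

private:
    std::vector<Record> records_;
};
```

Because nothing changes after construction, concurrent queries need no locking in this sketch, which is the property a create-once, read-only design relies on.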
THUIRDB is a foundation library on which further features can be built; many features are currently unsupported or still in the validation phase.
No support for the SQL language
C language interface only
Key: maximum of 512 bytes (can be increased if larger keys are needed)
Value: maximum of 4K bytes (can be increased if larger values are needed)
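A caller can check the size limits above before storing a record. The following is a minimal sketch: the constants and the function fits_size_limits are hypothetical names, and "4K bytes" is assumed to mean 4096 bytes.

```cpp
#include <cstddef>
#include <string>

// Default limits as stated above. Assumption: "4K bytes" means 4096 bytes.
const std::size_t kMaxKeyBytes = 512;
const std::size_t kMaxValueBytes = 4096;

// Hypothetical pre-check (not part of THUIRDB): true when both the key and
// the value are within the default size limits.
bool fits_size_limits(const std::string& key, const std::string& value) {
    return key.size() <= kMaxKeyBytes && value.size() <= kMaxValueBytes;
}
```

A 512-byte key with a 4096-byte value passes the check, while a 513-byte key or a 4097-byte value is rejected.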
2011-04-19 Invited to speak to the Natural Language Processing group at the Institute of Computing: report on THUIRDB principles and technical exchange.
2011-05-31 Invited to speak to the Information Retrieval Group at the Institute of Computing: report on THUIRDB principles and technical exchange.
2011-06-14 Invited to the NetEase Hangzhou Institute for a report on THUIRDB principles and technical exchange.
2011-06-17 Invited to the Shanghai Stock Exchange for a report on THUIRDB principles and technical exchange.
2011-06-21 Invited to Taobao in Beijing for a report on THUIRDB principles and technical exchange.
2011-06-24 Invited by the database group at Tsinghua University to describe how THUIRDB works.
2011-06-28 invited to introduce THUIRDB should search technology works.
2011-06-29 Invited by Sina Weibo to introduce how THUIRDB works.
2011-07-10 Invited to take part in the Taobao technical carnival and give a guest report.
2011-07-10 invited to Thailand for the principles of science and technology for THUIRDB reports and technical exchanges.
2011-11-09 Invited to People Search for a report on THUIRDB principles and technical exchange.
2011-11-14 Invited to speak at Microsoft Research Asia: report on THUIRDB principles and technical exchange.
2011-12-03 Invited to take part in a Hadoop event and introduce how THUIRDB works.
2012-11-30 Took part in the Chinese information retrieval conference and gave a report.
Copyright 2007.8 State Key Laboratory of Intelligent Technology and Systems, All Rights Reserved
Download link: download (6.1KB). Refer to ReadMe.txt in the download to learn how to use it. New users can run sh help.sh for a one-step trial.
Download Link: pdf
Online systems currently supported:
1) Weibo people search online system: xunren.thuir.org
2) GoOnReading System: duxiaqu.com
3) Multi-language Wordbank: cikuapi.com