[wplug] Large Database

Bill Moran wmoran at potentialtech.com
Fri Mar 6 11:09:36 EST 2009


In response to Michael Semcheski <mhsemcheski at gmail.com>:

> On Fri, Mar 6, 2009 at 9:25 AM, DK <wplug at curlynoodle.com> wrote:
> > I need to implement a data acquisition system which will sample and
> store large amounts of "time-series" data, that is, hundreds of millions
> > of records.  I would like to investigate using an open source
> > database.  Does anyone have suggestions?
> 
> One thing to consider, depending on the size of the data, and the
> kinds of searches you need to do...
> 
> My guess (based on a small sample size) is that moving data you're
> not searching on out of the database (i.e., stuff that you might
> store in a blob or image field goes to a file instead) can be a good
> strategy where it's possible.
> 
> My experience has been that BLOBs are relatively slow to retrieve
> (though convenient to use)...  Anyone else concur or disagree?

I disagree.  Again, this might be a limitation of MySQL, but PostgreSQL
has few performance problems storing lots of BLOB data.  And, for
_really_ large BLOBs, PostgreSQL has large objects, which allow random
access to different parts of the field (basically the same semantics
as random access on a file).
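
As a rough sketch of what that looks like (from Python via psycopg2;
the dbname and OID below are made up for illustration):

  # Random access into a PostgreSQL large object via psycopg2.
  # The OID would normally be kept in a regular table column.
  import psycopg2

  conn = psycopg2.connect("dbname=mydb")

  # Large objects must be accessed inside a transaction;
  # psycopg2 starts one automatically.
  lob = conn.lobject(oid=12345, mode='rb')  # open existing object read-only
  lob.seek(10 * 1024 * 1024)                # jump 10 MB in, like fseek()
  chunk = lob.read(8192)                    # read 8 kB from that offset
  lob.close()

  conn.commit()
  conn.close()

You get file-like seek/read without pulling the whole value across the
wire, which is the main win over an ordinary bytea column.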

Granted, if you have too much BLOB data, it won't be able to cache it
all in RAM, which will slow things down, but you'd have that anyway
if you had all the files in the filesystem.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

