Hacker News | nwmt's comments

Not to take away from the substance of the article itself, but is anyone else surprised that they have 2 billion "documents", which presumably means active ads/listings? That seems like an awful lot.


MongoDB is being used for historical archiving, not for the live site itself. The main reason is that changing table schemas for very large sets of old data is painful with MySQL. So the 2 billion figure covers any ad/listing older than a set amount of time.

The live data is < 1 TB and is still stored in MySQL.


Exactly.

The "set amount of time" typically hovers around 60 days, though our archiving process has been off for several months while the migration took place. So we have some catching up to do--somewhere in the neighborhood of 150 million postings, last I counted.


I've been hearing some good things about Riak lately and their masterless implementation seems quite interesting. Did Riak ever make your radar and, if so, what were the disadvantages that made you choose MongoDB?

Were I to guess based on the video, I would say lack of a Perl client and you'd probably end up having to roll too many of your own solutions on top of it?


I would have expected more, personally. Craigslist is massive, popular, and has been around a long time. That adds up to a TON of listings.


~2.2 billion is a ton of listings. What you have to realize is that craigslist wasn't in hundreds of cities on day #1. In recent years, we've had tens of millions of "live" ads on the site, but it took a bit of time to grow to that size.


This looks like data warehousing of the archive. The two billion listings probably represent all expired ads ever. There is no way they have 2 billion active ads at any one time.


The above comment is correct.

The archive does have to be accessed by users though, since users can access listings from many years back.

The entire archive seems to be under 4 TB from what he described in the video (2 billion documents at 2 kilobytes each). They do not retain photos.
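As a rough check on that estimate, here is a minimal sketch of the arithmetic. It assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and treats 2 KB as an average document size; the function name is made up for illustration and none of this reflects how craigslist actually measures its archive.

```python
def archive_size_tb(doc_count: int, avg_doc_bytes: int) -> float:
    """Estimate total archive size in decimal terabytes (10**12 bytes).

    Both inputs are rough figures taken from the discussion, not measurements.
    """
    return doc_count * avg_doc_bytes / 1000**4


# 2 billion documents at roughly 2 KB each (figures quoted from the video).
estimate = archive_size_tb(2_000_000_000, 2_000)
print(f"Estimated archive size: {estimate:.1f} TB")
```

Index overhead, replication, and padding are ignored here, so real on-disk usage could differ noticeably from this figure.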


Yup. You hit the nail on the head.


How much photo data do you handle? How long do you keep it?


The photos are removed once the posting is no longer live on the site (roughly). As for how many, I'd have to dig a bit to find that out...


#1 is a great example of why founders should be vested. The rest aren't really about startups as much as teams (dev or sales) in general, but obviously they're still relevant to startups, and further illustrate how important the hiring process is for startups.


Not that I've had to directly use it, but a friend of mine was showing me their university's new student center portal called SOLUS, and it's awful. It's impossible to find anything on it, and yet it's the only way to change courses, check marks, etc. It's sort of like a router interface, but harder to use and swamped with more information. The worst part, of course, is how much money it must've cost to transition this year.


Clicking directly only gives you the first few paragraphs then says you need to subscribe. For the full article, Google it and click the article link: http://www.google.com/search?q=H-P+Chief+Warns+of+Tough+Time...


Agreed. The fact that it uses Hadoop should be secondary to who they are. "Software Company Raises Money" would also be factually correct, but not particularly useful.


I think it's a bit premature to suggest that either a) Facebook would move out of California or b) a new social network would start up outside of California and eat Facebook's lunch because it didn't have draconian laws. Surely there are easier ways to solve the problem at hand.


I'm only going from the article, which is evidently flawed (TechCrunched!), but it could effectively add a tax on social networking sites operating in the state. States compete for companies all the time and this could provide an opening for somewhere else to put together a more compelling package. Obviously the big drawback to leaving is the lack of tech people, so I guess you weigh your options. Personally, I feel like Facebook and Google would set up shop on the moon if it were in their financial interest and they could find someone to run it.


What you're missing is that the state the company is located in doesn't matter. In general, if it's doing business in California, it can be sued in California court. This has been the case for a long long time. Otherwise everyone would just set up in Delaware or whichever state was most inaccessible and/or sympathetic to corporate defendants.


At the risk of going off topic here, with some of the new things Qt is planning (getting away from its C++ roots), maybe it would be best if Qt 4.7 (which is, in my opinion, one of the best platforms for developing Windows programs, never mind cross-platform) was left as it is and put into an open-source maintenance mode instead of trying to innovate.


It's worth noting that at the moment nearly 2400 people flat out want all software patents abolished.

Plus, lumping the other options in together isn't necessarily fair. Revoking existing software patents would have some pretty severe effects on legitimate companies and their shareholders. And the "other changes" is extremely vague - one could assume that other changes includes placing the onus on the company applying for the patent to prove its uniqueness, which would increase the cost to receive a patent and decrease the likelihood of abuse.

All in all, 19 votes at the moment support the current system. Less than 1%. And to be fair, roughly 15% believe better software patents would be OK.


True. But I still can't help being surprised by the results. Maybe that's because here in Germany, suggesting software patents has almost become politically incorrect. At least, that's my perception.


If the potential home buyer wanted to keep their monthly outlay at $1700 whether they rent or own, then I think in 5 years if they lose $50k on their house they are likely out more money with a bigger headache than renting would've cost them.

Without property tax, $1700/mo means taking a 25 year loan of $320k at 4% interest. Obviously whether this is too high or too low depends entirely on where they live, but again I'm assuming that they were getting a similar place for roughly $1700/mo in rent, and are comfortable in that lifestyle. That's also assuming they have the cash for a down payment ($65k if they want to put 20% down).

So, if you run the mortgage numbers, in 5 years, assuming interest rates stay at 4%, they will have paid $61k in interest. Property taxes over 5 years (where I live, anyway) on a $385k home would be $15k-$20k on top of that. If we factor property tax into the $1700/mo, then they can only afford a $265k loan, and only pay $50k in interest over 5 years.
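For anyone who wants to rerun these numbers, here is a minimal sketch using the standard fixed-rate amortization formula. It assumes monthly compounding, a constant 4% rate, and no fees, taxes, or extra payments; the function names are made up for illustration.

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed monthly payment for a fully amortizing loan."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # total number of payments
    return principal * r / (1 - (1 + r) ** -n)


def interest_paid(principal: float, annual_rate: float, years: int,
                  months_elapsed: int) -> float:
    """Total interest paid during the first `months_elapsed` payments."""
    r = annual_rate / 12
    payment = monthly_payment(principal, annual_rate, years)
    balance = principal
    total_interest = 0.0
    for _ in range(months_elapsed):
        interest = balance * r            # interest accrued this month
        total_interest += interest
        balance -= payment - interest     # remainder of payment reduces principal
    return total_interest


# Loan sizes from the comment: $320k without property tax, $265k with it.
for loan in (320_000, 265_000):
    pay = monthly_payment(loan, 0.04, 25)
    paid = interest_paid(loan, 0.04, 25, 60)
    print(f"${loan:,} loan: ${pay:,.0f}/mo, ${paid:,.0f} interest over 5 years")
```

Under these assumptions the results land close to the figures above: a payment a little under $1700/mo on the $320k loan, and five-year interest of roughly $60k and $50k respectively.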

Either way, $50k or more in interest + $18k in property tax + $50k in depreciation + $15k in closing and real estate fees costs them a lot more than renting for the same time period, and that's assuming that nothing goes wrong. If their roof starts leaking, the deck collapses, or they need to move for work or their growing family in less than five years, they are in an even worse situation. (One might suggest we should amortize the $15k in closing and real estate costs over the 5 years and buy a smaller place accordingly, but then your $1700/mo rent becomes more like $1150/mo in mortgage payments, and the question becomes how out of whack the renting and buying markets are, assuming again that you want to maintain a similar living standard and not be downsizing as you're turning 30.)

The obvious question then is how likely "even if your house will depreciate $50k" ends up being.


At the risk of inflaming some - if you were buying the home, a paper loss in market value is irrelevant unless you are planning on selling the home during that time.

The real problem here is that we're not really talking about home ownership or buying a house. We're talking about taking out huge loans from banks and paying them off over decades, along with a ton of interest, and hoping that property values increase so we can trade up to something bigger and end up in exactly the same situation: rinse, repeat. The real problem is that we discuss "ownership" of homes when we don't really own them; the bank does.

