darma1's comments

darma1 · on July 25, 2012

Too much obsession over UI.

What people really "need" is the raw data.

If they want to make some UI that they like, then they can do it. If they want to offer this to others, they can do it.

If they want to load the data into some SQL database, they can do it.

If they want to split the data into some other format using csplit and load it into some other, faster database, they can do it.

If they want to extract a portion of a raw file and just use agrep on that, they can do it.

The point is that UI is a personal decision.

The fact that some people do not like CL's bare-bones UI doesn't give them the right to do anything. The fact that some people don't like bloated and clumsy web interfaces and prefer text commands doesn't give them the right to do anything either.

But people can't be stopped from making personal decisions about how to process data. Public data.

It's funny how some websites think they "own" data that is given to them. Do they "need" this data? Yes, they do.

darma1 · on July 25, 2012

If they go to court and get an opinion it should be really helpful as a guide for other entrepreneurs who see re-processing the information on the web in new ways as the foundation for their business.

You mean, like Blekko?

The simple, inconvenient truth is most folks who are making money from the web, like search engines, are not content creators (nor content owners), they are content publishers... who publish for free. "Are you a non-technical person who wants to get something onto the web? No problem. We'll help you with that, for free. Just give us some personal info about you so we can solicit money from advertisers."

(Placement, e.g., paid placement, where the eyeballs are more likely to see something, for a fee, is another matter.)

ChuckMcM · on July 25, 2012

In the case of search engines there is already a lot of case law from people suing Google, of course. As with most things, it's a spectrum. Using a search engine as an example (and disclaimer: I work for Blekko, a search engine), a search engine crawls the web, then it computes a number of parameters about the page it's crawled (what it's about, how many people link to it, who it links to, etc.) and creates a new piece of information called a 'rank'. Then when a search query comes in, the query is used to create a way of recognizing a 'target' page, and then the rank is used (and in our case slashtags too) to decide what pages you might be looking for. The results also include snippets from the page to help the user evaluate whether or not the page is the one they want.
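To make the crawl, rank, match, snippet sequence concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the three pages are made up, and the 'rank' is just a count of inbound links, which stands in for the much richer set of parameters a real engine computes. It is not how Blekko or Google actually rank pages.

```python
# Hypothetical "crawled" pages: URL -> page text and outbound links.
PAGES = {
    "a.example/rentals": {
        "text": "Apartment rentals listed by neighborhood and price.",
        "links": ["b.example/maps"],
    },
    "b.example/maps": {
        "text": "Maps of apartment listings across the city.",
        "links": [],
    },
    "c.example/blog": {
        "text": "A blog post about apartment hunting with maps.",
        "links": ["b.example/maps", "a.example/rentals"],
    },
}


def compute_rank(pages):
    """Derive a 'rank' per page. Here: number of inbound links (a stand-in)."""
    rank = {url: 0 for url in pages}
    for url, page in pages.items():
        for target in page["links"]:
            if target in rank and target != url:
                rank[target] += 1
    return rank


def snippet(text, term, width=30):
    """Return a short excerpt of the page text around the first match."""
    pos = text.lower().find(term.lower())
    start = max(0, pos - width)
    return text[start:pos + len(term) + width]


def search(pages, rank, query):
    """Find pages containing every query term, ordered by rank (highest first)."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = [
        url for url, page in pages.items()
        if all(term in page["text"].lower() for term in terms)
    ]
    hits.sort(key=lambda url: (-rank[url], url))
    return [(url, rank[url], snippet(pages[url]["text"], terms[0])) for url in hits]


RANK = compute_rank(PAGES)
for url, score, excerpt in search(PAGES, RANK, "apartment"):
    print(score, url, "|", excerpt)
```

The point of the sketch is only that the rank and the ordering are new information computed by the engine, while the snippet is a short excerpt of someone else's page.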

Now, that use has generally not been highly contested: people want their pages to be found, and so they tolerate search engines searching them. They can be explicit about which pages they want searched and which they don't using robots.txt. So that relationship is pretty well understood. People who ban Blekko (and presumably anyone else) in their robots.txt file are not crawled by us; we recognize and honor that it is their choice whether they want to be in our index or not. On the copyright issue, however, it has been pretty clearly established that 'page rank', like someone's review rating on a movie or an application, constitutes an original work of the creator. There is a lot of experience with things like book reviews, where the review, using snippets to illustrate the review, and a rating are both fair use and the original work of the reviewer.
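For anyone who hasn't looked at how a crawler honors robots.txt, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` and `OtherBot` user-agent names, and the URLs are all hypothetical, and this is not a description of any particular engine's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler is banned outright,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser, user_agent, url):
    """Ask whether the given user agent is allowed to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(may_crawl(parser, "ExampleBot", "http://site.example/listing"))
print(may_crawl(parser, "OtherBot", "http://site.example/listing"))
print(may_crawl(parser, "OtherBot", "http://site.example/private/notes"))
```

A well-behaved crawler runs a check like this before every fetch and simply skips whatever the site owner has disallowed.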

Google, however, got in trouble with their news aggregation service. The bulk of the arguments there were that the snippets were so complete on the news page as to exceed 'fair use' exemptions, and that by aggregating these pages they were 'stealing' traffic that might otherwise go to the news site. The results in those cases were mixed, with some newspapers being removed from Google's index, and others not. Generally everyone that was removed has since been restored (at the request of the news source) because Google does drive more traffic to a web site than any other web service. So in this case, while Google was found to violate the copyright of these news organizations by indexing their newspapers without their consent, the papers later found it in their best interest to give their consent.

Craigslist and Amazon and eBay are a third kind of question. They are a collection of 'facts' (as many have pointed out) which are derived by a process (placing ads). And in the 'old' world the courts have generally sided with the person who had paid the economic cost of creating those collections. And as PadMapper and others before them have shown, there is a great temptation to use those same facts and re-package them into a new collection. This pretty naturally sets up a commercial tension between the original collector and the new user of those same facts. That seems to open another front in copyright litigation and policy. So if this court gets an opinion published, it cannot help but be influential, as there don't seem to be very many in this space. That could be because judges think the right answer is 'obvious', but I seriously doubt that to be the case.