Monday, September 3, 2012

Ehow's Spark; Undetectable, Unstoppable.


Meet the deadliest, stealthiest crowdsourced content scraper yet;
Ehow.com's Spark.


"Pinnablebusiness.com" - a website devoted to uncritical fawning over Pinterest - warns us all about the evil new crowdsourced content scraper at ehow.com called Spark.

The author gushes about Pinterest to such an extent that even the negatives are spun into tiny silver linings.
When Pinterest outranks you for your own content, it can be a good thing if you would otherwise not rank on page one or two of search engine results for a particular search phrase.
I recommend reading the whole article, if only to chuckle at the contradictions of Pinterest=good and Spark=bad for doing the same thing.

The author is right in that this new entity makes Pinterest look like a children's choir. Ehow's Spark is entering the content scraping field like a nuclear pirate ship sailing into a koi pond. Isn't this quote from their TOS positively terrifying?
...you hereby grant eHow a worldwide, royalty-free, freely transferable, freely sublicensable (through unlimited levels of sublicense), non-exclusive license to use, reproduce, modify, transmit, distribute, publicly perform and display (including in each case by means of a digital audio transmission), and create derivative works of the User Generated Content, in any form, media, or technology now known or later developed. You also hereby waive any moral rights you may have in such User Generated Content under the laws of any jurisdiction. You hereby appoint us as your agent with full power to enter into and execute any document and/or do any act we may consider appropriate to confirm the grant of rights, consents, agreements, assignments and waivers set forth in these Terms. You agree that we may (but are not obligated to) display your User Generated Content, and your user name or your actual name (according to the preferences you select at the time that you register) along with your User Generated Content.
Waiving moral rights? MORAL RIGHTS? And user-generated content? No... this is scraped third-party content, let's call a spade a spade. Ironically, ehow is particularly protective of "its" content:
Site on your computer for your own personal, noncommercial purposes. eHow reserves all other rights in the content on the Site, on its own behalf and the behalf of its licensors (including contributors), and eHow does not, directly or by implication, by estoppel or otherwise, grant any other rights or licenses to you under these Terms. Except as expressly stated in this paragraph, you may not reproduce, distribute, modify, publicly perform or display, or prepare derivative works of any content on the Site without prior written consent from eHow or the third-party owner of the rights in that User Generated Content (if any).
You can find further horrifying details in Ehow's Terms and Conditions here.

And now for some obligatory cutesy lingo: Spark users aren't pinning; they aren't loving; they are clipping.

Scraping from Google Images is expressly encouraged.
Looking for inspiration?
Search Google Images
Spark's links are straight up, worthless nofollow links.

For maximum intrusion into your privacy, users can only log in with Facebook or Google. The Spark people aren't playing; this is business.

All that infringed content is kitted up with a convenient PIN IT button to help the infringement spread like wildfire.

Their "pinmarklet" is a very, very special copyright infringement tool. Raise your glass to real innovation; it grabs the text, too, and the text's formatting. Crowdsourced copyright infringement without the boundaries.

On the go? Ehow's Spark wasted no time providing the volunteer content scrapers a mobile app:


Lo and behold, we do have yet another proprietary blocking tag to add to the collection in our ever-swelling header field:

<meta name="ehow" content="noclip" />
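If you want to confirm that the tag actually made it into your published pages, here is a minimal sketch using only Python's standard library. The class and function names are my own illustration, and the check only tells you the tag is present in the markup; whether Spark honors it is something only Spark can answer.

```python
from html.parser import HTMLParser


class NoClipFinder(HTMLParser):
    """Record whether a page carries <meta name="ehow" content="noclip">."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if (attr_map.get("name", "").lower() == "ehow"
                and attr_map.get("content", "").lower() == "noclip"):
            self.found = True


def has_noclip_tag(html_text):
    """Return True if the opt-out meta tag appears anywhere in html_text."""
    finder = NoClipFinder()
    finder.feed(html_text)
    return finder.found
```

Feed it the saved source of any page (for example, the text of a template's rendered output) and it returns True or False.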

In order to test how best to block Spark sitewide with .htaccess (image substitution is so much fun!), I have compromised my privacy and created a test account with Spark. I "clipped" images from my own websites, and examined my logs. It is with great alarm and astonishment that I must report ehow.com's Spark to be UNDETECTABLE, and therefore UNSTOPPABLE by any means other than their arrogant opt-out meta-tag. They might be taking the images from what is already loaded in the user's browser, rather than requesting them from the creator's web server.
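For anyone who wants to repeat the log check on their own server, here is a rough sketch of one way to do it. It assumes the access log is in Apache's "combined" format (request, status, size, referer, user agent), and the function name and the default search string "ehow" are my own choices, not anything Spark documents. If the clipping really happens from the user's browser cache, expect it to come back empty, which is exactly the problem.

```python
import re

# Tail of an Apache "combined" log line:
#   "request" status bytes "referer" "user-agent"
COMBINED_TAIL = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def find_hits(log_lines, needle="ehow"):
    """Return log lines whose referer or user agent contains needle."""
    needle = needle.lower()
    hits = []
    for line in log_lines:
        match = COMBINED_TAIL.search(line)
        if not match:
            continue  # not in combined format; skip rather than guess
        haystack = (match.group("referer") + " " + match.group("agent")).lower()
        if needle in haystack:
            hits.append(line.rstrip("\n"))
    return hits
```

Pass it an open log file (or any list of lines) and it returns the matching entries; an empty list means nothing in the referer or user-agent fields gave the visitor away.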

The worst one yet. And still... the very worst is probably yet to come.

1 comment:

Crunchy Data said...

Glad to see you covering this from your own POV. As the author of the article you cited, my aim was to get people talking about this new threat to artists and writers. I neither "gushed," nor "spun," Pinterest, rather noted a specific instance in which Pinterest could outrank an originator and still help that originator. This opinion is based on my own experience. That said, everyone needs to decide for themselves whether any site should be able to scrape their content. I think we all need to raise hell about this as a joint effort, and scraper sites--all of them--should be mandated to allow sites to opt in to their scraping, rather than forcing us to scramble to add code and warnings to our own TOUs. I would welcome you to guest post on CrunchyData.com, if you are interested. You can feel free to publicly disagree with me about Pinterest on my site, and maybe reach new readers who should consider every angle of the issue or even change my mind. (email through my site or just info[at]crunchydata.com. ;)