Quantity vs. quality on the consumer web
Published on Jul 07, 2011
4558 Views
Do you like content farms? You are helping them succeed. Every improvement in open structured data, machine learning and NLP gives rogue web players ammunition in the battle against search engines and us.
Chapter list
Quality to Quantity to Quality on the Web 00:00
Topics 02:11
We are all trying to organize the web 02:33
Making it right 02:46
Picture - 1 02:56
Picture - 2 03:01
Not so long ago 03:14
some other people 03:18
are trying to do the opposite 03:19
trying to disorganize 03:22
Picture - 3 03:29
using the tools we have built 03:44
Picture - 4 03:50
motives 03:54
it is about profit 03:59
Profit 04:01
So, why do I care? 04:26
Job opening 04:33
And why might you care? 05:01
What do we do at Zemanta 05:27
Zemanta 05:33
Example - 1 06:21
Example - 2 06:57
Example - 3 07:31
Opening up the hood - 1 07:43
Opening up the hood - 2 07:56
Opening up the hood - 3 07:59
the reality - 1 08:11
the reality - 2 08:14
How it works 08:33
Main design goals 09:11
Analysis pipeline - 1 10:42
Analysis pipeline - 2 11:03
Background knowledge - 1 15:31
Background knowledge - 2 17:39
After analysis 18:24
Solr 18:53
Lucene plain “More Like This” 20:16
Metrics & tests 20:32
Overview 21:00
So mash-ups happen... 23:34
Offering services 24:18
now back to the bad guys - 1 24:39
now back to the bad guys - 2 24:41
Job opening 24:47
There's more than meets the eye 25:15
Diagram - 1 25:27
Diagram - 2 25:57
Finding their keywords, niches 26:05
The Register 26:20
The sophisticated part of the market 26:35
Diagram - 3 27:06
Find / create content 27:11
Articles 27:25
Cover your tracks 28:23
T һiѕ iѕ nοt the text you аre lookinɡ for. - 1 28:34
T һiѕ iѕ nοt the text you аre lookinɡ for. - 2 28:50
Translate it to random language and back to English 29:08
Covering their tracks 30:16
Blog Articles 30:44
Cover your tracks: use Zemanta, OpenCalais to add tags, images, links 31:04
Spammers say the darndest things 31:08
TagPiG 31:25
Pull additional content from Freebase 32:12
Demand Media 32:23
Remixing linked data and spam 32:56
Diagram - 4 33:10
Publish 33:13
Diagram - 5 34:29
Valuable comments 34:35
Diagram - 6 35:20
Profit? 35:23
Search engines to the rescue? 35:52
Ecosystem 36:20
Food for thought 36:44
think reCAPTCHA 37:01
Could article directories be fruitfully used? 37:16
Find rewritten articles and use them as parallel corpuses? 37:29
Could we use global workforce market more efficiently to get more linked data? 37:46
Thesis, antithesis, synthesis? 38:40