09.20.07
Semantic Web, please go away
I’m tired of hearing how the semantic web is the big thing, the meta-web, the solution to all things. It’s not. It likely will never get off the ground, and if it does, it will be of very little use. Here’s why.
Read-write has an article about how the semantic web has failed so far and how it should change to gain traction. I largely agree with it. While eggheads support the semantic web, real people are getting things done without it.
The semantic web cannot succeed. I will use Read-write to prove it.
Read-write uses three HTML meta tags for this article written by Alex Iskold. One is author, which is listed as Richard MacManus. The other is description, which has a Digg link, hardly a description of anything. The third is copyright, which goes to MacManus, but not Iskold. I’ll trust the last is correct. This is typical of the web.
Unlike most semantic web supporters, I have crawled the web for 4 years now. The first thing you learn when crawling is not to trust servers. If it says UTF-8, don’t trust it. If it says it is written in Chinese, you had better test it against a real language detection module. The same is true for Read-write here. Two out of three fields are incorrect due to copying by machines.
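A minimal sketch of that distrust in Python, assuming the crawler already has the raw bytes and the charset the server declared. The function names and the fallback list are hypothetical, chosen only for illustration. Note that a clean decode does not prove the declaration is right (latin-1 accepts any byte sequence), so this only catches the obvious lies.

```python
def decodes_as(raw: bytes, declared: str) -> bool:
    """Return True if the raw bytes decode cleanly under the declared charset."""
    try:
        raw.decode(declared)
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers charset names Python does not recognize.
        return False


def pick_encoding(raw: bytes, declared: str, fallbacks=("utf-8", "latin-1")) -> str:
    """Trust the declared charset only if the bytes actually decode with it."""
    if decodes_as(raw, declared):
        return declared
    for name in fallbacks:
        if decodes_as(raw, name):
            return name
    # latin-1 maps every byte to a character, so it is a safe last resort.
    return "latin-1"


if __name__ == "__main__":
    # The server claims UTF-8, but the body is really latin-1.
    print(pick_encoding(b"caf\xe9", "utf-8"))
```

Language detection would need a separate module on top of this; the sketch only covers the charset claim.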
The semantic web suffers from malice. We have a long tradition of keyword stuffing, cloaking, gateway pages and other malicious devices. Web spammers will only use the semantic web to spread disinformation. It may list Britney Spears ringtones in the description, but the page offers Viagra.
The semantic web requires a network to exist before anything else starts. Crawlers and indexers will not spend time parsing semantic web information until there is a critical mass of correctly formatted pages. Site owners will not move until the indexing makes use of the semantic information.
Usually when someone trashes the semantic web, microformat goons come out of the woodwork to sing its praises. Yet, microformats suffer from the same fraud and incompetence as the semantic web. Plus the network has to appear out of whole cloth.
So what is good, if microformats and the semantic web are doomed to failure? Freedom. Someone should have the freedom to create whatever page he wants using any method (Flash, JS, AJAX), and guys like me will have to divine its purpose. No one should have to try hard to describe a page’s purpose. Finding the purpose is the hard work that keeps me employed.
James said,
September 21, 2007 at 2:00 am
Hi David! I just wanted to point you to my blog so that you can do some further reading on the topic of the Semantic Web. Unfortunately Read/Write Web’s Alex Iskold is not fully qualified to be speaking about the topic in such a matter-of-fact way. Trust me, you can’t wish the Semantic Web away simply because you don’t fully understand it. Getting your Semantic Web news from a Web 2.0 blog doesn’t help either. Cheers!
Tom Morris said,
September 21, 2007 at 2:56 am
You seem to be confusing the Semantic Web with metadata. Semantic Web development today is about laying translation over what we already have on the web. Take hCard - how many people lie about their phone number in an hCard? Nobody wakes up and thinks “I know! Let’s put up a web page claiming to be me but with the wrong phone number.”
RDF/Semantic Web approaches do not rely on trusting data on webpages. But I can, for instance, digitally sign the RDF data. You can have data that is digitally signed to my identity.
Trust is something we can build on top of the web. Hopefully my friends can trust e-mails that come from me a bit more than they can trust from viagra spammers. We can take this to the web.
Alex Iskold said,
September 22, 2007 at 5:20 am
Yes, David, I agree, talking about absolute correctness to the point where computers are capable of proofs is pretty absurd. Too bad people do not get it.
Yihong Ding said,
September 22, 2007 at 7:52 am
David,
All of your concerns are valid, but they cannot kill the fact that people need a web of data. The current problem is not that the dream of the Semantic Web is unreal. Rather, it is that the majority of people don’t really understand the path towards the Semantic Web.
Realizing the Semantic Web is an evolutionary process, not just a single goal to be reached. This process is full of mutations and natural selection. It also depends on the accumulation of lower quality web resources, from which the mutation to higher quality web resources can happen (the general Law of Transformation of Quantity into Quality). This whole process is much more complicated than a single human project, even one as great as the Apollo Program.
Please read my post about “Semantic Web: Difficulties and Opportunities” and also several of my articles about web evolution. I hope they give you another perspective on what the Semantic Web is and how we can realize it.
cheers,
– Yihong
Rayne said,
September 22, 2007 at 4:24 pm
I know that the semantic web, as they’ve been talking about for so long, is dead in the water, with its buzzwords, marketing hype, and the shrinkwrap in which it has been offered.
However, if anything can make use of the semantic web’s features, such as microformats, it would be the wiki, since it (especially Wikipedia) has such a focus upon the aggregation of knowledge. RDF, which is a tool for knowledge representation, would only assist further in that aggregation.
If the semantic web is talking about any other web-based, non-wiki medium, then no, it’s impossible for it to get off the ground.
David Kellogg said,
September 22, 2007 at 10:02 pm
“Unfortunately Read/Write Web’s Alex Iskold is not fully qualified to be speaking about the topic in such a matter-of-fact way.”
Really? A user of the web is unqualified to write about stuff that he is supposed to use in the future. He backs his arguments with data, but does not sit in the ivory tower with you.
I must admit I am unqualified because I don’t understand the difference between OWL’s minCardinality and maxCardinality. Maybe you can enlighten me on this wonderful and easy to use standard.
David Kellogg said,
September 22, 2007 at 10:09 pm
“You seem to be confusing the Semantic Web with metadata.”
I am not confused about what metadata is, nor the semantic web. I used the metadata on Read-write (sorry Alex) to illustrate what really happens on the web.
“Trust is something we can build on top of the web.”
That is so quickly stated and so hard to realize.
David Kellogg said,
September 22, 2007 at 10:38 pm
“RDF, which is a tool for knowledge representation, would only assist further in that aggregation.”
There is nothing in RDF documents I have seen and written that could not be expressed in JSON or key-value pairs. RDF never caught on. I used it for my extensions, but I don’t feel the religious fervor of others.
Microformats are just an admission that the Semantic Web lost its war. I do wonder if microformats will die just because they are viewed as Semantic Web, Jr.
James said,
September 23, 2007 at 1:24 am
@David
Perhaps my words were too harsh; I didn’t mean ill toward Alex. I simply don’t agree with what he said in his articles, and I tend to have strong feelings about this topic.
Interesting that you would identify me as someone in the Ivory Tower; I always saw myself as a little more grassroots.
James
James said,
September 23, 2007 at 1:30 am
>>Microformats are just an admission that the Semantic Web lost its war. I do wonder if microformats will die just because they are viewed as Semantic Web, Jr.
The Semantic Web is not waging war against any other technologies. It has no equal, and nothing is in line to supersede its vision. From the inside (in our bubble?) we can see that the Semantic Web is still making much progress.
Here’s my theory:
Let’s take Web 2.0 for example. The movement started, and then it got very popular, and in a while it will plateau and we will see where it goes from there. (following the Gartner hype cycle).
The Semantic Web got a lot of attention when we first heard about it. Hype grew, but unlike Web 2.0 (whose technology to implement the ideas already existed), the Semantic Web stack of technologies still needed to be fleshed out. It’s just not as simple as other steps in Web evolution. Hell, it’s a revolution.
I’ve been following the hype of the Semantic Web, and it’s growing. We were graced with a second chance in the hype cycle. Only this time around, we will have all the technology in place and the hype will have a place to go.
Don’t worry my friend, it’s coming.
Tom Morris said,
September 23, 2007 at 8:27 am
“buzzwords, marketing hype, and the shrinkwrap in which it has been offered”
What buzzwords and hype? All the buzzword and hype machines seem to be touting Ajax office suites and undefined ’social graphs’ - the people I see working on the Semantic Web mostly hang out on IRC, write code and hate buzzwords.
“I must admit I am unqualified because I don’t understand the difference between OWL’s minCardinality and maxCardinality. Maybe you can enlighten me on this wonderful and easy to use standard.”
OWL != RDF. OWL is a layer on top of RDF, and it’s completely optional. You don’t have to write an OWL ontology for everything. It’s as big a myth as the “committee” myth.
In most everyday instances, OWL is not necessary - and in most instances where OWL is used, the more complex parts of it are not used. So, for instance, Class and Object/Datatype Property are used a lot more than, say, cardinality restrictions.
As for minCardinality and maxCardinality - it’s quite simple. If you have defined a class, the min and max cardinalities are the minimum and maximum number of times an individual property can appear. An example: say we had a class called Car, and we wanted to represent the type of tires it has, we may have an object property called “fittedTire”. This would link to an object called Tire. As it is a Car, it cannot have more than four fitted tires, because it has only got four wheels. So the maximum cardinality is four. If it has more than four fittedTire relationships, something is wrong. The minimum cardinality is the opposite - if we want to say “this object must have at least X instances of this predicate” you use minCardinality.
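To make the tire example concrete, here is a minimal Python sketch that counts “fittedTire” links per car and flags counts outside a given range. It is an illustration only, not how an OWL reasoner works: OWL uses open-world semantics, so a real reasoner does not simply count statements the way this closed-world check does. The triples, the individual names and the `cardinality_violations` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical (subject, predicate, object) triples; names are illustrative.
TRIPLES = [
    ("car1", "fittedTire", "tireA"),
    ("car1", "fittedTire", "tireB"),
    ("car1", "fittedTire", "tireC"),
    ("car1", "fittedTire", "tireD"),
    ("car1", "fittedTire", "tireE"),  # a fifth tire: over the maximum
    ("car2", "fittedTire", "tireF"),
    ("car2", "fittedTire", "tireG"),
    ("car2", "fittedTire", "tireH"),
    ("car2", "fittedTire", "tireI"),
]


def cardinality_violations(triples, individuals, predicate, min_card=0, max_card=None):
    """Return {individual: count} for individuals whose count is out of range."""
    counts = Counter(s for s, p, _ in triples if p == predicate)
    bad = {}
    for individual in individuals:
        n = counts.get(individual, 0)
        too_few = n < min_card
        too_many = max_card is not None and n > max_card
        if too_few or too_many:
            bad[individual] = n
    return bad


if __name__ == "__main__":
    cars = ["car1", "car2", "car3"]  # car3 has no tires listed at all
    print(cardinality_violations(TRIPLES, cars, "fittedTire", min_card=1, max_card=4))
```

Here car1 breaks the maximum of four and car3 breaks the minimum of one, while car2 passes.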
This avoids the key point though - most people will never have to touch an owl:minCardinality restriction, just as most people who publish an RSS feed have never read an XML Schema Document or DTD (and, in the case of XSD, can probably sleep better at night without seeing such a mess).
“There is nothing in RDF documents I have seen and written that could not be expressed in JSON or key-value pairs.”
How about “Someone with the name ‘John’ knows someone with the name ‘Jane’, who knows someone with the email address ‘god@example.org’ who once kissed a vegetarian who she met in 1963”? RDF is great for representing graph structures because it *is* a graph structure. Because it uses URIs, it’s great for combining graph structures and querying them without having to write a lot of CRUD. Those URIs also allow predicate namespacing, meaning extensibility that key-value pairs can’t get to - and the ability to scale up to the size of, well, the Web.
David Kellogg said,
September 23, 2007 at 9:14 am
“The Semantic Web is not waging war against any other technologies. It has no equal, and nothing is in line to supersede its vision.”
It is a war, and why are you waging it on innocent web devs?
There seems to be no good natural-language programming language to replace COBOL, but it’s not the wave of the future. Just because there are no good replacements does not make it the future of anything. Some things are just too complicated and too hard to understand to succeed easily.
HTML is easy. A third grader knows right away what the <blink> tag does. This is not true in the Semantic Web world. No one really understands it, and elevator speeches for this technology do not exist.
David Kellogg said,
September 23, 2007 at 9:27 am
“RDF is great for representing graph structures because it *is* a graph structure.”
Is it really a graph, not a tree? I must be confused. I guess now XML is not laid out like a tree. Again, all of your wonderful information can be placed in MySQL or JSON using key-value pairs. No one loses. In fact MySQL has the added benefit over the Semantic Web in that it exists.
I love this grassroots talk. It’s a standard created by the W3C! The W3C is a bunch of old white men with too much time on their hands. They travel the globe scratching their chins at conferences that produce nothing.
Tom Morris said,
September 23, 2007 at 3:53 pm
Yes, RDF is a graph: “The underlying structure of any expression in RDF is a collection of triples, each consisting of a subject, a predicate and an object. A set of such triples is called an RDF graph” (RDF Concepts & Abstract Syntax document, §3.1)
The Semantic Web - as you define it - may not exist. RDF does exist. And it can represent tree structures in a much easier way than either SQL or key-value pairs can. And you can query it extremely easily using a simple, well-defined query language called SPARQL.
“Semantic Web, please go away”. Sorry. It’s here to stay. People may redefine the Semantic Web to be just the bits of the SW vision they don’t like or don’t care about or are incompatible with their corporate worldview, but those using the technology are finding a great deal of use from it.
David Kellogg said,
September 23, 2007 at 8:26 pm
What’s so special about RDF? RDF is crap based on a flawed XML standard written by ivory tower occupants. I could write any RDF file in JSON, including your pedantic trios. RDF is one reason why Firefox is so slow and hard to code for. The really crazy thing is that your semantic cult only accepts XML, when any data format will do, including JSON and flat text files. In any case, my flat file will parse faster than your XML any day. XML has its place, but you place it everywhere.
As for graphs, you were not talking about this definition of a graph http://en.wikipedia.org/wiki/Graph_%28data_structure%29 .
I love how RDF shows the triumph of the Semantic Web, when RDF preceded the Semantic Web by at least 2 years. In other news, George Bush took credit for the Magna Carta, the Battle of Agincourt and the Declaration of Independence. Causality does not allow your faux triumph.
I hate to repeat myself. Where is your Semantic Web? Where can I download it to my iPhone? Can I use it on my Vista phone?
Tom Morris said,
September 24, 2007 at 3:19 am
What on earth are you talking about? RDF is a data model, not a format. You can represent RDF however you like. XML is only one of these - and not a particularly good one. Almost all the RDF I write is in Notation3 format - which is a light text format. Most SPARQL databases now syndicate RDF out in XML, N3/N-Triples and JSON. This is another case of a straw man.
Yes, RDF is a graph model. I’m not sure how you can deny it is. In English, an RDF triple set could say:
1. John knows Jane.
2. Jane knows Mike.
3. Mike knows John.
4. Susie is the mother of Mike.
5. Susie knows Richard.
Convert that in to a machine-readable RDF and drop it in to a graph plotter. The sum of a group of triples is a graph! We don’t just call it a graph to sound fancy. We call it a graph because it *is* a graph.
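As a rough illustration of the point, the five statements above can be loaded as plain tuples in Python and walked like any other graph. This is a sketch only: the predicate labels “knows” and “motherOf” are stand-ins for real URIs, and `build_graph` and `reachable` are hypothetical helper names, not part of any RDF library.

```python
from collections import defaultdict, deque

# The five statements as (subject, predicate, object) triples.
# Real RDF would use URIs; these short labels are assumptions for readability.
TRIPLES = [
    ("John", "knows", "Jane"),
    ("Jane", "knows", "Mike"),
    ("Mike", "knows", "John"),
    ("Susie", "motherOf", "Mike"),
    ("Susie", "knows", "Richard"),
]


def build_graph(triples):
    """Turn triples into a labelled adjacency list: node -> [(predicate, node)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Breadth-first walk returning every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(sorted(reachable(graph, "John")))
    print(sorted(reachable(graph, "Susie")))
```

John, Jane and Mike form a cycle (John knows Jane, Jane knows Mike, Mike knows John), which is something a strict tree cannot contain.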
“I love how RDF shows the triumph of the Semantic Web, when RDF preceded the Semantic Web by at least 2 years.”
That’s funny. Tim Berners-Lee was talking about building a Semantic Web in 1994:
http://www.w3.org/Talks/WWW94Tim/
The first working draft of RDF was in 1997:
http://www.w3.org/RDF/#timeline
Even MCF, the format which came immediately before RDF is dated at 1996-7:
http://downlode.org/etext/mcf/
Conclusion: the Semantic Web as a concept has been brewing inside TimBL’s head for about the same amount of time as the Web itself. RDF is a result of that, not vice versa.
As for practical applications of the Semantic Web - have you tried Operator in Firefox? I can now go to Google Local, find a business, and in two clicks dial the number in Skype or automatically load the phone number in to my phone. This is all based on microformats at the moment, but newer versions are also supporting RDF standards like eRDF and RDFa.
Look at dbPedia - a huge subset of the data on Wikipedia in an RDF database that you can query using one POST request, or you can get queryable fragments back through GET requests.
The Semantic MediaWiki extension - just one of a new breed of semantic wiki extensions - turns wikis into RDF databases.
There’s databases like Geonames, MusicBrainz and more coming online everyday. It’s not at the point where everyday users can use it, but it’s certainly at the point where there’s value for developers.
The browser-based RDF browsers are maturing - especially Tabulator, which is now a Firefox plugin. It’s certainly not simple enough for my parents to start using, but it’s neat.
And there are venture-backed Valley companies in the SW space now: Metaweb (freebase.com), Cubicon (cubicon.org) and Radar Networks, off the top of my head.
David Kellogg said,
September 24, 2007 at 8:52 am
I googled “Local google semantic web” and the number one result I got was from a Google executive, titled, “Semantic Web Battles Incompetence”. He doesn’t mean Semantic Web cultists are incompetent.
“We deal with millions of Web masters who can’t configure a server, can’t write HTML. It’s hard for them to go to the next step. The second problem is competition. Some commercial providers say, ‘I’m the leader. Why should I standardise?’ The third problem is one of deception. We deal every day with people who try to rank higher in the results and then try to sell someone Viagra when that’s not what they are looking for. With less human oversight with the Semantic Web, we are worried about it being easier to be deceptive,” [Google executive Peter] Norvig said.
I find it fitting that Google independently beat me to the exact same arguments, because I think there is underlying truth to my points. The Semantic Web must be simple, or no one will use it. I work every day indirectly with incompetent or lazy webmasters. The Semantic Web is targeted towards people like me, not the hoi polloi.
Huuuuuge sites you have there. From alexa rankings.
cubicon.org: No data
freebase.com: 53,346
Operator in Firefox? No one even reviewed the thing, so how popular can it be? Also, it uses Microformats, the proof of the Semantic Web’s demise.
“Susie knows Richard”
We don’t need the Semantic Web to know these things. Ever used Plaxo, Facebook or Myspace? How did these ever grow without giving a hoot about the Semantic Web? Now that they own their own graph, what good is it for them to publish my information publicly? Do you think any of them store RDF internally? My guess is no, they used real databases. We have three of the largest closed social graphs out there, and they all need SQL or MySQL developers, not Semantic Web devs.
Again I ask, where is the Semantic Web?
Tom Morris said,
September 24, 2007 at 11:46 am
I’ve published RDF conversion tools for some large social networks - Twitter and last.fm. LiveJournal and other Six Apart products publish FOAF profiles. There is a lot of RDF data out there, but a lot of it is hidden behind corporate firewalls. RDF is used extremely heavily in the bio-medical field. I know that Renault, Chevron and Kodak have all used SW technologies quite successfully in enterprise data projects. There are various large SemWeb projects going on behind the scenes in both government and the private sector.
How exactly is microformats the proof of the Semantic Web’s demise? Pretty much everyone who’s involved with SW development thinks microformats are great, and are extremely useful for expressing non-domain-specific data on web pages. The GRDDL standard provides a simple, well-defined bridge between both general (ie. canonical) and domain-specific microformats and RDF. Most implementors of GRDDL parse microformats by default. There is a feasible limit on microformats - microformats are essentially conservative and web-wide. GRDDL allows the same kind of process to be useful in domain-specific places - in online communities, in industry-specific use cases etc.
As for Operator, I’m not sure what the reason is why there aren’t reviews. It is being rolled in to Firefox 3 though.
“We don’t need the Semantic Web to know these things”
No, but in some cases it makes things easier. Once you beat the somewhat daunting learning curve, there is a point where it’s extremely easy to use RDF technology to perform some complex queries on data. Think SQL database with hyperlinks.
The Semantic Web - as YOU define it - does not exist. That’s because you define the Semantic Web as being all the bits of some grand, overarching theory that’s either bound to failure or that has already failed. The Grand Semantic Web or whatever straw man version of the Semantic Web that you can dream up - yeah - it’s not happening.
The Semantic Web as a common platform of linked data does exist, and we are growing it - if only in baby steps. The Valley startups I listed are small. That was exactly the point. There are also huge Semantic Web projects - most of those are behind the corporate firewall. Have you seen the Gartner Hype Cycle? Well, Semantic Web technology is just slowly climbing up the slope of enlightenment thanks to technologies like SPARQL and GRDDL.
Where is the Semantic Web? Well, I repeat. Look at DbPedia. Look at Revyu.com. Look at LiveJournal. Look at GRDDL.
David Kellogg said,
September 25, 2007 at 8:44 am
I saw this in SOAP in the past 5 years. I saw this in push technology in the Nineties. Once it is known by everyone who could care about it, it shrinks. Semantic Web http://google.com/trends?q=semantic+web is shrinking in mindshare. The graph points downward.
For all the respect I have for Brad (formerly of LiveJournal) for excellent software, he misses the boat with the social graph. Here are its problems.
* It’s complicated.
* Too much cooperation is required.
* Large projects rarely get off the ground.
* There is no profit motive for sharing.
* There is a profit motive to deceive.
If I really cared about friends of friends, I would skip SPARQL entirely and feed it into MapReduce. And with MapReduce, I can handle so much more data more easily than a small federation of DBs.
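A minimal sketch of what that might look like, written as plain Python functions in a map/reduce style rather than against any real MapReduce framework. It assumes friendships are symmetric and the whole friend list fits in a dict; `map_phase`, `reduce_phase` and `friends_of_friends` are hypothetical names for illustration.

```python
from collections import defaultdict
from itertools import permutations


def map_phase(person, friends):
    """Emit every ordered pair of this person's friends.

    Each pair (a, b) means a and b share `person` as a mutual contact.
    """
    for a, b in permutations(friends, 2):
        yield a, b


def reduce_phase(pairs, direct):
    """Group pairs by first member, dropping people who are already direct friends."""
    fof = defaultdict(set)
    for a, b in pairs:
        if b not in direct.get(a, set()):
            fof[a].add(b)
    return fof


def friends_of_friends(direct):
    """Run the map step over every person, then the reduce step over all pairs."""
    pairs = (
        pair
        for person, friends in direct.items()
        for pair in map_phase(person, friends)
    )
    return reduce_phase(pairs, direct)


if __name__ == "__main__":
    direct = {
        "John": {"Jane", "Mike"},
        "Jane": {"John"},
        "Mike": {"John", "Susie"},
        "Susie": {"Mike"},
    }
    for person, suggestions in sorted(friends_of_friends(direct).items()):
        print(person, sorted(suggestions))
```

In a real cluster the map step would run per input record and the framework would do the grouping; here a generator and a dict stand in for both.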
And DbPedia, whose search is written in .NET? (Server: Microsoft-IIS/6.0) That’s a strange technology choice.
So far no one has shown a widely adopted, large scale use of the Semantic Web, not even through LiveJournal’s nascent effort. The Semantic Web’s mindshare is really small and falling according to Google Trends. Now I question my title, “please go away”. After further research, I think it never really arrived.
terry chay said,
October 2, 2007 at 9:43 am
I love the Semantic Web.
It’s the biggest moving target. As soon as FOAF dies, it’s now XFN, no wait, it’s…
Fuck, one day, they might stumble upon something that people will use and then say, “Semantic Web was that all along.”
The Woodwork » Blog Archive » Web n point Oh! said,
November 5, 2007 at 12:34 am
[…] Long before Jason Calacanis’s prank, Web 3.0 was supposed to be “the semantic web.” So when I received an invite to this talk, I had to forward it on to Dave, especially in light of this post. […]
Ric said,
July 13, 2008 at 8:59 am
Taken literally, semantics is just the study of meaning in communication. Anything that adds additional meaning to your data is a step in the right direction. The ability to add tags to items is the extent of most sites’ semantic features, but this is better than nothing.
The way to get semantics off the ground is to take the burden of labelling away from the user …or at least make it very easy, or better still make it seem to the user like they’re not doing any work at all.
Data is near useless without context and meaning, so at Swirrl we’re trying to overcome the hurdles that the semantic web faces.
http://blog.swirrl.com/articles/2008/07/10/why-the-semantic-web-has-failed-to-get-off-the-ground