Entries Tagged 'semantic web'
June 12th, 2009 — semantic web, social nets, virtual worlds
I attended the Friday session of Metaverse University 2009 at Stanford last week. Here are some of my observations:
Themes: interoperability, open source, simulations, visualizations, breaking down the walls, and being stuck with Second Life. Little emphasis on chat and social networking, per se. Much more emphasis on architectures & component solutions.
Trends from Virtual Worlds Roadmap: simulation & training, health care, augmented tourism, mixed-reality museums, live sporting events in VW’s, virtual meetings.
Overview:
While the hype over virtual worlds has faded, many serious researchers continue to do fascinating work in the territory. A good VW monetization strategy is still needed, but for many of the speakers this goal seems to have receded into the future. The same goes for many of the enterprise-scale companies investigating these spaces, who seem less interested in making money (either through direct development and monetization or by riding the public hype train) and more interested in gaining efficiencies and trimming overhead (teleconferencing, remote collaboration).
Google (O3D), Intel (Cable Beach), Sun Microsystems (Project Wonderland), Samsung (Virtual Worlds Roadmap), and Nokia (supporting realXtend) were all present, as well as many Stanford researchers, including the folks building Sirikata. Many are working to extend the OpenSim fork of the Second Life platform. None of them seemed to be working towards direct productization (though Google wants O3D to be the in-browser standard for 3D content), but each was working to advance the platform and explore future possibilities.
With monetization off the table and money drying up, researchers are moving to embrace open source solutions (OpenSim, ScienceSim, Ogre3D) and pushing for open standards (OpenID, OAuth, XMPP) and flexible API’s. Almost everyone mentioned a desire to move away from the proprietary walled-garden approach towards an integrative one that looks to the success of social network strategies. While celebrating open source development of Second Life forks, almost everyone bemoaned being stuck on the platform, often underscoring the feeling with a groan that “there’s nothing else”.
Authoring was rarely addressed, with content instead being re-purposed from upstream solutions, eg using 3DSMax & Maya content to build world content. Collada was uniformly mentioned as the exchange format. Most developers still want to shoehorn other modalities (eg PowerPoint, web browsing, document collab, etc) into the VW space. Some examples inadvertently showed the clunkiness of current solutions. I asked why a technology like PowerPoint is any better in 3D than in 2D, eliciting a long pause from the presenter. There’s still a lot of ambition on the part of developers but not always a ton of common sense.
However, IBM’s manager of service design and service systems research, Susan Stucky, gave me the most reasonable answer I’ve heard yet about why it’s important to move 2D modalities into 3D. She said that for collaborative telepresence it was very helpful to have access to everything you would normally have access to in a meeting. When I spoke with her at the break, she told me how IBM has found that the greatest use of their Second Life investment has come from the ability to bring employees and clients from around the world together into a collaborative space. They’ve held conferences, run meetings, and explored simulations of project management strategies. For her, the ROI was gained by telepresence & simulations.
And for me, I had a breakthrough speaking with Susan. One of the most compelling yet least-obvious values of collaboration in virtual worlds is the sense of embodiment conveyed by the presence of the avatar. Identity, social cohesion, team building, and friendship arise more naturally when those engaged are perceived as physically present. Self-awareness and the projection of self onto others are still quite bound to our physical bodies. Perhaps combining the embodiment of avatars with in-world access to knowledge & productivity tools represents a more effective modality for non-local collaboration. I’m not sure how this compares to video teleconferencing but I feel there’s a lot of depth to be explored in how virtual embodiment reinforces social cohesion & collaboration (attn: PhD candidates).
Other notables: Henry Lowood (Stanford Curator of History of Science, Media, & Genetics) speaking on The Ultimate Archive: building virtual museums of virtual world platforms inside virtual worlds (eg a virtual museum with a room that lets you play the first Doom level as it was originally). He noted both “perfect capture” (all the data can be archived) and “perfect loss” (experiences, emotions, and deleted content cannot be captured) in VW archiving. Sheldon Brown (Center for Research in Computing in the Arts, UCSD) showed his mind-bending work Scalable City and called for procedurally deriving world assets and behaviorally deriving world experiences.
Analysis:
Virtual Worlds have lost funding and are presently in the Valley of Hype. Effective monetization strategies have yet to reveal themselves. However, there is value to the enterprise in leveraging virtual worlds for telepresence and collaboration, simulation & training. The VW community is moving the R&D towards openness: open source components, open standards, interoperability, and engaging with the platforms and principles of social networks to enhance connectivity and move away from the Walled Garden. The most interesting work with virtual worlds continues to be in the deeper realms of behavior, psychology, telepresence, and simulations. Graphically, everyone is apparently stuck in Second Life. A smart, well-funded private investor would build a platform with competitive graphics capabilities (surface mesh, brep, kinematics, HLSL, etc), a powerful and scalable object model that can push to XML/RDF/RSS, a powerful simulation engine with an expressive visualization/analytics front-end, and a REST/JSON API capable of talking to agents, tools, and other VW’s (as well as Twitter, Facebook, LinkedIn, SMS, Playstation Network, XBox Network, etc). That investor would also integrate ActiveX embedding of 2D tools (Office apps, browsers, etc), enable a content marketplace built around highly expressive and personalizable avatars and fetish objects, and cultivate a 3rd-party service ecosystem supporting all of the above.
Is this so hard?
February 4th, 2009 — dataviz, semantic web
The New York Times will not go quietly into the dark knight of new media. Amidst constant rumors of the death of traditional news, the much-respected industry stalwart is moving quickly to build a compelling and forward-looking solution that redefines “the newspaper as platform”, as ReadWriteWeb’s Marshall Kirkpatrick notes.
Today the NYT announced that 2.8 million articles will be exposed to the digital world through the site’s API, allowing anyone to link, annotate, mashup, and crawl the data for meaning. This opportunity to construct data visualizations that abstract patterns & trends from within the articles is perhaps the most interesting element that immediately adds human value to what is otherwise an overwhelming amount of information (2.8 million articles).
The recent Twitter Superbowl visualization, as well as other visualization experiments at NYT.com, are indicators of how the company is gathering data and parsing it in meaningful ways. A list of Twitter posts related to the Superbowl is just a long index table. Even reading the Summize Search feed for such a huge event is dizzying. But a geo-located, timeline mashup of tweets & key terms with a map of the US is immediately valuable to anyone trying to get a bead on trends. Their implementation is simple & entertaining, and you can derive substantial meaning at a glance.
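As a rough sketch of the kind of aggregation that sits behind a mashup like this (not the NYT’s actual implementation, which I haven’t seen), here is how geo-located tweets could be grouped into time buckets per state and scanned for key terms. The record shape (`time`, `state`, `text`) and the function name `bucket_terms` are my own assumptions for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def bucket_terms(tweets, terms, minutes=15):
    """Count key terms per (time bucket, state).

    tweets: iterable of dicts with hypothetical keys "time" (datetime),
            "state" (str) and "text" (str).
    minutes: bucket width; assumed to divide 60 evenly.
    """
    wanted = {t.lower() for t in terms}
    counts = defaultdict(Counter)
    for tweet in tweets:
        ts = tweet["time"]
        # Round the timestamp down to the start of its bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % minutes,
                            second=0, microsecond=0)
        for word in tweet["text"].lower().split():
            word = word.strip(".,!?#@\"'")  # drop punctuation and hashtag marks
            if word in wanted:
                counts[(bucket, tweet["state"])][word] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"time": datetime(2009, 2, 1, 18, 7), "state": "PA", "text": "Go Steelers!"},
        {"time": datetime(2009, 2, 1, 18, 16), "state": "AZ", "text": "#Cardinals answer back"},
    ]
    for (bucket, state), terms in sorted(bucket_terms(sample, ["Steelers", "Cardinals"]).items()):
        print(bucket.strftime("%H:%M"), state, dict(terms))
```

Each (bucket, state) pair maps naturally onto one frame of a timeline and one region of a map, which is all a front-end needs to animate the trend.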
These experiments are proofs of concept that point the way towards more advanced viz mashups now further enabled by opening the NYT information archives and building a coherent API on top. Imagine, for example, sections of NYT.com dedicated to serving all outgoing comm from a particular region, say Gaza & Jerusalem. Imagine seeing real-time visualizations of the thoughts and feelings of average citizens free from the carefully structured statements of the vested power interests hurling rockets and armor at each other. Or imagine crawling the news reports of the last 8 years looking for instances of the words “Bush”, “Abramoff”, and “Florida flight school”…
Of course, this is another big win for the information transparency movement – information wants to be free, after all – and you can expect many others to get the message and follow suit. But it also wraps the current events of our world as reported by NYT in a searchable and re-configurable layer establishing a protocol for interfacing with these vast data stores. This open approach certainly cries out for some sort of semantic layer and I suspect the Reuters/Calais folks are paying very close attention to this announcement.
This is the prevailing trend of this current phase of digitizing culture and communication. Data is accumulating at an ever-increasing rate requiring open standards for archiving, interrogating, & visualizing the meaning held within. The tools are evolving to sort the signal from a vast sea of noise. More and more information archives will be exposed and more and more tech will be created to interface with it and draw out meaning from the morass. The global sharing of information and communication is feeding the pool of innovation that continues to radically alter the face of our world with each new discovery.
Whether or not information wants to be free… We certainly need it to be.
January 14th, 2009 — semantic web
My thoughts submitted to the Adobe Reader Blog for the post Take the Adobe Reader Survey. As a former Adobe employee who worked on Acrobat & PDF I have a lot of personal interest in seeing the format grow and evolve.
The growing public perception is that PDF is too bulky and increasingly too opaque for the networked world. This is because PDF’s have not kept up with the prevailing trends of transparency, findability, and collaboration. PDF is important as a container with certain rights & privileges (DigSig, Security, Markup, Forms), but the data inside a PDF is far more important. Currently, PDF’s are way too opaque, too bloated, and do not clearly convey value to most users. This is especially true on mobile (why would I choose to view PDF on mobile if not required by an enterprise I need to engage with?). For most enterprises and customers, PDF is a cloud of data more than a display standard. Its value is no longer in consistent display of fonts and formatting. It’s in the data within the millions of PDF’s that the IRS has, for example. Even as a Forms front-end it’s difficult to see why Reader/Acrobat is a better solution than a robust customizable Flash interface. The Flash-based Portfolios feature is a step in this direction.
How can Reader add value to the massive volumes of archival PDF that already exist? Answer: 1) replace Reader with a robust, customizable Flash front-end, and 2) engineer semantic data* into new & existing PDF’s so that cloud agents can sift through the documents and return meaningful results. Both of these strategies should focus heavily on supporting LiveCycle for both distilling and evaluation of PDF’s.
The static viewer model is dying. People need to be able to search, sort, find, annotate, and share. Reader is already too heavy to be of value in a browser, much less on a mobile device. Any mobile solution must dis-aggregate formatting from data and be able to dynamically reconfigure the display to present only the important data/form elements to the mobile user. At the very least, PDF’s need some serious reformatting before they can be of any real value on the mobile platform. There’s just not enough real estate. Furthermore, any PDF-mobile solution must begin with the realization that mobile = personal, collaborative, locative.
If Adobe doesn’t do this, you can bet there will be lucrative opportunities for others who understand that the value of data is no longer in its formatting. It’s in accessibility and structured reporting. Frankly, any business intelligence solution that doesn’t address the growing heap of PDF’s lying on its servers will fail to really leverage its own data effectively.
* I think I’m starting to use the term “semantic” a bit loosely. Essentially, I’m suggesting that Acrobat should engineer active creation of RDF structures inside PDF COS and as header info. PDFLib should extend to support both writing & reading of this framework. Likewise, top-down text analysis should spider both doc text and COS to construct relevant metadata (RDF & taxonomies) written into the PDF file header. The point is to make PDF’s as transparent & searchable as possible to those actors & agents with access rights.
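To make the footnote a little more concrete, here is a minimal sketch of what generating an RDF description for a document might look like. It only builds the RDF/XML text; it does not touch PDF COS or any Adobe library, and how the result would be embedded in a file is left open. The function name `build_rdf`, the document URI, and the choice of Dublin Core properties (`title`, `creator`, `subject`) are my assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set


def build_rdf(doc_uri, properties):
    """Return an RDF/XML string describing doc_uri.

    properties maps Dublin Core element names (e.g. "title") to text values.
    doc_uri is a hypothetical identifier for the PDF being described.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": doc_uri}
    )
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_rdf(
        "urn:example:form-1040",
        {"title": "Individual Income Tax Return", "subject": "tax forms"},
    ))
```

An agent with access rights could then parse this block and answer questions about the document without rendering a single page.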
January 9th, 2009 — semantic web

The folks over at Twitchboard.net have the right idea. From their site:
TwitchBoard listens to your twitter account, and forwards messages on to other internet services based on what it hears. Our first service will automatically save any links you tweet to the del.icio.us bookmarking service. We’re working on connections to many other services — stay tuned!
This simple tool is a software agent built on the web platform. It lives on a server as a script watching your personal datastream (Twitter, in this case). The initial service notices when you have put a URL in your tweet, grabs it, and passes it along to your del.icio.us account as a bookmark. It effectively chains two web services together to optimize your workflow and eliminate the need to double post. It extends the function of Twitter to include the function of Del.icio.us, recapitulating the phylogenetic imperative evolving from unicellular function to multicellular. Twitterl.icio.us!
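The core loop of such an agent is small. Here is a minimal sketch, assuming nothing about how TwitchBoard is actually built: messages come in as plain strings, and `save_bookmark` is a hypothetical stand-in for whatever call posts a link to a bookmarking service. No real Twitter or del.icio.us API is used.

```python
import re
from typing import Callable, Iterable

# Simple pattern for http/https links; real-world messages may need more care.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text: str) -> list[str]:
    """Return the links found in a message, trimming trailing punctuation."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def forward_links(messages: Iterable[str],
                  save_bookmark: Callable[[str, str], None]) -> int:
    """Pass every link in every message to save_bookmark(url, note).

    save_bookmark is a hypothetical callback standing in for the
    bookmarking service. Returns the number of links forwarded.
    """
    forwarded = 0
    for message in messages:
        for url in extract_urls(message):
            save_bookmark(url, message)  # the original message becomes the note
            forwarded += 1
    return forwarded


if __name__ == "__main__":
    saved = []
    stream = [
        "Reading about cloud agents http://example.com/agents.",
        "No link in this one",
    ]
    forward_links(stream, lambda url, note: saved.append((url, note)))
    print(saved)
```

Keeping the destination behind a callback is the design choice that matters: the listener stays the same while new services are plugged in as additional callbacks.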
Twitchboard represents the emerging class of cloud agents that will help us sort and search the massive volumes of data we interact with regularly. Our connections are getting too dense and the data we’re working with is growing far too big for us humans to handle manually. We need subroutines customized to our interests, affiliations, businesses, and collaborations that can do the heavy data lifting for us while we focus on the meaningful expressions these agents will create for us from the noise.
Increasingly we’ll have swarms of such agents running across our digital lives doing our bidding and the bidding of numerous marketing and security agencies as well. These tools will have particular value across the enterprise where they will monitor workflows & financial movements, gather market data from clouds, and sift through productivity metrics to formulate valuable business intel. Agents will tell us about our lives and our health delivering colorful abstracts with pretty animated datasets showing how much we drove this week, how many miles we walked, tasks completed vs. outstanding, and much more feedback based on an array of scripts & sensors.
Twitchboard is using the fertile comm grounds of Twitter and its API to watch the datastream for keywords that can drive additional services. You can bet they’re also deriving all sorts of interesting meta-patterns from the Twitter feed that will be plugged into further modular mashups and visualizations. Through its popularity and the openness of its API, Twitter is lighting a roadmap towards the semantic web. Groups like Twitchboard are building the services reading the machine web and helping us better manage the mountains of data piling up, meanwhile giving rise to a class of autonomous agents moving across devices, sensors, cameras, and clouds.
[Kudos to Sarah Perez of ReadWriteWeb for mentioning me & this post in her column!]