The Times, They Are A-Changing
by Kimball Bighorse on June 30, 2009

Here at TechCrunch, Henry Work has apparently been holding down the fort by himself since December, and he has now moved on to a new chapter in his life. So I’ve taken it upon myself to carry his torch of Crunchy dev-goodness. Actually, Hunter the intern is here as well, carrying the torch with me, at least for the summer. Here’s to Henry and the legacy he left behind. And here’s hoping that we can keep it going here at CrunchBase.

CrunchBase Gets A New Look
by Henry Work on June 2, 2009

After being behind the times for a few months, the CrunchBase logo (see above) now joins its peers across The Crunch Network with a new, sporty look. We’ve also updated the CB favicon, iPhone webclip, and the TechCrunch logo in the bottom-right-hand corner of the site. Lastly, we added the TC Network tab to the top of every CrunchBase page, so that you can more easily switch between properties in our network.

Thanks to TC Alumnus Mark Hendrickson for the original designs, and super intern Dan Romero for putting it all together.

If you’re an API developer and want to use CrunchBase art for your app, the files are available here for download.

New CrunchBase APIs: Permalinks And TechCrunch Posts
by Henry Work on June 1, 2009

If you’re a CrunchBase API developer, we have two new APIs for ya. These should make it easier to find CrunchBase entities, as well as match TechCrunch stories related to particular companies and people. We use these APIs ourselves for implementing the Company Index on TechCrunch, and so we thought these might be useful for other developers as well.

Permalink API

Ever needed to find the corresponding CrunchBase page for a particular Company/Person/Financial Org name? Well, we now have an API for that, and it is easier to use than our Search API.

Syntax

For all entities except people:

http://api.crunchbase.com/v/1/<plural entity namespace>/permalink?name=<entity name>

For people:

http://api.crunchbase.com/v/1/people/permalink?first_name=<person first name>&last_name=<person last name>

Examples

Returns

If the permalink is found, JSON will be returned in the following format (this one for companies):

{"name": "Google",
"crunchbase_url": "http://www.crunchbase.com/company/google",
"permalink": "google"}

If a permalink is not found, the HTTP status code will be 404 and the JSON return will be (note that this is a 404 for companies):

{"error": "Unknown company. Please see www.crunchbase.com/help/api for help."}

Notes

The entity name (or for people, first and last name) is case-insensitive. Replace any spaces in the entity name with ‘%20’.
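
As a rough illustration of the syntax and notes above, here is a minimal Ruby sketch that only builds the request URLs. The helper names are made up for this example, and "companies" is an assumed value for the plural entity namespace; only "people" is spelled out in this post. It uses ERB::Util.url_encode from the standard library because that turns spaces into %20 rather than "+".

```ruby
require "erb"

API_ROOT = "http://api.crunchbase.com/v/1"

# Percent-encode a value so that spaces become %20, as the notes require.
def encode(value)
  ERB::Util.url_encode(value)
end

# Permalink lookup URL for any entity type except people.
# `namespace` is the plural entity namespace ("companies" is an assumption).
def permalink_url(namespace, name)
  "#{API_ROOT}/#{namespace}/permalink?name=#{encode(name)}"
end

# Permalink lookup URL for a person.
def person_permalink_url(first_name, last_name)
  "#{API_ROOT}/people/permalink?first_name=#{encode(first_name)}&last_name=#{encode(last_name)}"
end

puts permalink_url("companies", "Red Hat")
puts person_permalink_url("Henry", "Work")
```

Since the lookup is case-insensitive, there should be no need to normalize the case of the name before encoding it.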

TechCrunch Posts API

Want data on how many times a particular company has been written about on TechCrunch? Well, you can do that now as well. The syntax is very similar to the Permalink API, above. You start with an entity name.

Syntax

For all entities except people:

http://api.crunchbase.com/v/1/<plural entity namespace>/posts?name=<entity name>

For people:

http://api.crunchbase.com/v/1/people/posts?first_name=<person first name>&last_name=<person last name>

Examples

Returns

If any posts are found, JSON will be returned in the following format (this one for companies):

{"posts_url": "http://www.crunchbase.com/company/google/posts",
"num_posts": 1174,
"name": "Google",
"crunchbase_url": "http://www.crunchbase.com/company/google",
"permalink": "google"}

If no posts are found, the HTTP status code will be 404 and the JSON return will be (note that this is a 404 for companies):

{"error": "Unknown company. Please see www.crunchbase.com/help/api for help."}

Notes

The entity name (or for people, first and last name) is case-insensitive. Replace any spaces in the entity name with ‘%20’.
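
One way a client might handle the two kinds of responses described above is sketched below. The num_posts helper is hypothetical, and the sketch takes an already-fetched status code and body instead of making a real HTTP request, so it can be tried against the sample JSON from this post.

```ruby
require "json"

# Returns the post count for a successful lookup, or nil when the API
# answers with a 404 or an "error" payload.
def num_posts(status, body)
  data = JSON.parse(body)
  return nil if status == 404 || data.key?("error")
  data["num_posts"]
end

# Sample bodies copied from the formats shown above.
found = '{"posts_url": "http://www.crunchbase.com/company/google/posts", ' \
        '"num_posts": 1174, "name": "Google", ' \
        '"crunchbase_url": "http://www.crunchbase.com/company/google", ' \
        '"permalink": "google"}'
missing = '{"error": "Unknown company. Please see www.crunchbase.com/help/api for help."}'

p num_posts(200, found)
p num_posts(404, missing)
```

Checking both the status code and the "error" key is a defensive choice for the sketch; the post only promises that the two arrive together.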

Questions? Comments? Head over to the CrunchBase API group.

The 2008 Year in Review gets reviewed
by Heather Harde on February 22, 2009

This past week, TechCrunch released its first research report, The 2008 Year in Review. The primary data source is CrunchBase, our wiki-style database of start-up companies, people and investors where we’ve been collecting data for the last fifteen months.

Jeremiah Owyang, a senior analyst for Forrester Research and general internet strategist, was one of the first media professionals to reach out and offer to give us a read Friday night. Within a few hours, I was both surprised and impressed to see his review of the research. It’s a very fair commentary. (Forrester has been at this for years.) Jeremiah is correct that, like many v1 TechCrunch products, the report offers a lot of valuable data and needs additional polish.

We’d love more constructive criticism about our latest research initiative. If you’d like a chance to review the 2008 Year In Review and share your recommendations, please contact us.

The CrunchBase Shot
by Henry Work on October 23, 2008

Here’s a cool usage of the CrunchBase API: The CrunchBase Shot. Snap Technologies is the company behind Snap Shots, the link enhancer that ‘shoots’ you to webpages without actually leaving the page that you’re on.

Now Snap recognizes links to CrunchBase and, thanks to our API, creates a custom view template that looks quite nice. Snap serves a ton of traffic every month, and so it’s great that they can provide a clean-looking interface for CrunchBase entities.

We love seeing our API being used in the wild. Please drop us a note on our API Google Group if you see it anywhere!

First Ever CrunchBase Data Mob In Austin: Tues Sept 23rd
by Gené on September 22, 2008

A bunch of Austin entrepreneurs and CrunchBase fans are hosting a Data Mob to highlight Austin startups and get their data into CrunchBase.  Very cool. If you want to set one up in your city let us know in the comments of this post.

—————————————-

Join us for the first-ever Austin Data Mob!

On Tuesday September 23rd, we will gather for free beer, soda, pizza and Tiff’s Treats at Conjunctured to enter information about Austin companies into CrunchBase en masse.

What is a data mob?

Data mobs have been most successfully implemented by Freebase, a semantic database company located in San Francisco, California. It’s a fun way to bring offline information online – and in so doing move us a little bit closer to the idea of a global knowledge commons. Data mobs are also just a great excuse to hang out, listen to music, eat, drink, and learn about cool new companies.

Why all the effort?

CrunchBase is an open-source database tracking technology companies around the world, and as such is widely used and cited. CrunchBase was developed by TechCrunch, and specifically Henry Work and his team of merry Ruby pranksters. Henry has also been so kind as to give us a direct-access account that will push our new data live to CrunchBase immediately (and with great power comes great responsibility).

There is a lot of amazing stuff going on in Austin that doesn’t get the recognition that it deserves.  This is one of many initiatives to heighten Austin’s national profile and promote the great community of entrepreneurs and technologists who reside here. We want new companies to, as John Erik is wont to say, “Ditch the Valley and Head for the Hills”.

We are Austin, and we are Startup District.

Okay, gimme the deets:

Date: Tuesday, September 23rd.
Time: 7:30 p.m. – ’til
Location: Conjunctured, 1309 E. 7th Street

Sponsors: Porter Novelli, Conjunctured, Austin Startup, and Moximity.

RSVP on Facebook:

http://www.new.facebook.com/editevent.php?eid=39754791150#/event.php?eid=39754791150&ref=mf

Or Upcoming:

http://upcoming.yahoo.com/event/1108005/?ps=6

The API Gets A Logo, We Talk Stats, We Show Off Apps
by Henry Work on August 29, 2008

OK, ok, I know it’s lame to cross-post. So here’s the skinny: our API got a new logo (isn’t it sweet? Thanks Hendrickson). We’re releasing our API stats from its first month of usage (also sweet). And we’re showcasing a few really awesome products now using our fine API (definitely sweet). That’s all you need to know. But you should go check out the article on TechCrunch I labored for almost 12 minutes over, complete with gripping title: Some CrunchBase API Stats and Apps.

Building A Semantic Web: Interview with Benjamin Nowack
by Rob Olson on August 26, 2008

Last month when we released the CrunchBase API, Benjamin Nowack came to our attention when he developed Semantic CrunchBase, an RDF/SPARQL interface to CrunchBase. Since then he has remained an active user of the CrunchBase API and last week released a Twitter bot that responds to commands with CrunchBase info.

Nowack runs a small web agency that focuses on combining mainstream website creation with Semantic Web technologies. In addition, he works as a contractor for early adopters in that area and maintains an open source RDF toolkit for LAMP environments. Through his efforts he hopes to get the SemWeb agency market off the ground.

For us, the Semantic Web is terra incognita. Eager to find out more about it, we contacted Nowack and asked him a few questions about Semantic CrunchBase and the Semantic Web.

CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released Semantic CrunchBase. Can you explain what Semantic CrunchBase is and what inspired you to create it?

Nowack: The graph-shaped CrunchBase data is ideal for showing that there is more (or rather *less*) to the Semantic Web than “AI on the Internet”. One of its core benefits is simplified data repurposing, plus the ability to extend applications at run-time. For Semantic CrunchBase, I’ve created machine-readable descriptions of all CrunchBase items, and also machine-readable links between related items (This process could be fully automated, thanks to the nice design of your API). Once we move from a Website of linked *pages* to a graph of linked *data objects* (and crunchbase.com is already pretty close), lots of new possibilities arise. Semantic CB allows the CrunchBase dataset to be explored and filtered using a faceted browser, there is a SPARQL endpoint for arbitrary graph queries, and a tool to define custom API methods which can integrate related Web data (such as the job feed from CrunchBoard, or dbpedia, a SemWeb version of Wikipedia).

CrunchBase: Do you know of any apps that are using Semantic CrunchBase to enhance their functionality?

Nowack: Only a few experimental ones. There was a short thread on the mailing list about using the SPARQL endpoint to extract social graph fragments from CrunchBase. SWSE, a semantic search engine, is experimenting with the data. Something I created myself is a Twitter bot that can answer questions such as “Founder of Flickr”.

CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?

Nowack: It was a trap! I was tricked into this whole SemWeb stuff in 2003 when I was looking for a topic for my diploma thesis. I read TimBL’s Weaving the Web where he explains the Semantic Web idea, and it all sounded like a great area to explore. However, there were hardly any toolkits for mainstream coders back then, so I started to write my own. And it took a while to realize that there is absolutely no need to implement all the specifications the SemWeb community comes up with every month. After figuring out which technologies to use and which ones to skip, I got pretty excited about RDF for website development, especially for small development teams.

CrunchBase: Can you put into layman’s terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?

Nowack: The basic ideas behind the Semantic Web are increased content granularity and repurposing of Web data. The goal is to move from a Web of documents to a Web of information items. And with the Resource Description Framework (RDF), we can do just that: Describe things in a more reusable way than with plain HTML, and let software utilize this “High-Resolution Web” (as Twine’s founder Nova Spivack likes to call it). RDF comes with a couple of own data exchange formats (XML and JSON, among others). The essential parts of the framework, however, are a simple, unifying data model (which by the way allows the integration of RSS, Atom, microformats, or other typical Web 2.0 information sources) and a query language, SPARQL. SPARQL is like SQL for the Web. Instead of tables, it joins (possibly distributed) resource descriptions. Think of a database-like interface to the Web. SPARQL also provides a standardized protocol, which enables something we could call “Mashup chaining”: the ability to build on the value created by other mashups, successively. RDF and SPARQL make it almost trivial to open enhanced data to other apps.

RDF and SPARQL are developer-oriented, they should not be exposed to non-tech website visitors directly. Their portability and flexibility *can* be passed through to the UI to a certain extent, though. For example, all filtering options in the faceted browser at Semantic CB are generated by SPARQL operations. These user-driven queries could possibly be ported to another dataset, or a different UI (which is what the Twitter bot is basically doing). Another example is the collection of resource descriptions (similar to RSS), where a website visitor could import or subscribe to very specific data objects. Users of the Operator Firefox plugin can do some of these things with microformats or RDFa (an RDF-in-HTML syntax) already today. I did some tests with a semantic clipboard some time ago. It worked, but introducing new UI patterns is not trivial. For end-users, I don’t expect in-your-face RDF and SPARQL anytime soon.

CrunchBase: On your website you describe “RDF and SPARQL as productivity boosters in everyday web development”. Can you elaborate on why you believe that to be true?

Nowack: RDF with its generic data model supports “data first” approaches for Web development. There is no need to define a model or database tables in advance, you can directly start with the app’s UI. The only custom things I needed for the initial Semantic CB were a parser for the API’s JSON, a theme for the site, and HTML templates for the resource views. (Well, and a server, but that’s another story.) Once I had a working prototype online, I could extend the system based on early feedback, without touching the database structure, and at run-time. The data model simply evolves with the app. And with SPARQL, you can access your data more easily than with SQL. The syntax is simple, you don’t have to worry about complex table joins any more (because querying is done on the graph, not on the storage level), and you can always export and reuse the aggregated information, should you want to. RDF is mainly marketed to domains such as Life Sciences or Enterprises, but I personally think there is an equally large potential for Web agencies and startups where a reduced time-to-market affects customer satisfaction and success. Some people have started work on an RDF toolkit for Ruby, it could be interesting to see that combined with an agile framework like Rails one day.

CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?

Nowack: I’m not a fan of version numbers (TimBL would probably consider the Semantic Web as Web 1.0, as it’s close to his initial vision). But in the context of continued progress (the time after centralized social networks, incompatible data portability “standards”, and overly generic RSS feeds) I agree with Nova’s statement. Semantic Web technologies enable flexible remixing of information on the Web. When we waste less energy on the “how”, we can put more focus on the “what”, try more things at lower costs, and accelerate (and even distribute) innovation. The RDF community has still some work to do with regard to attracting (and listening to) the larger Web community. But many specs and toolkits are still evolving and pragmatic contributors are clearly welcome.

Thank you to Benjamin Nowack for taking the time to answer our questions.

Track Changes With The CrunchBase Edit Timeline
by Henry Work on August 21, 2008

One of the biggest pieces of feedback we get is from people wondering what’s happened to the edits they’ve made. And we’re the first to admit: our user edit process still has a long way to go. But today we’re happy to announce a nice new way of keeping track of the queue of edits made to the site, including your own: The CrunchBase Edit Timeline. Great title, right?

The timeline will show you a reverse-chronological, paginated list of all the edits made to the site (all 77,047 of them). We’re also flashing some edit stats on the sidebar; there have been 581 edits to the site today, 1852 this week, and 9431 this month (thanks to the ActiveSupport CoreExtension Calculations for making these easy).
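
For readers curious what the today/week/month counts boil down to, here is a plain-Ruby sketch using only the standard library. It is not the site’s actual code: the real stats rely on ActiveSupport’s date calculations, while the edit_stats helper, the sample dates, and the choice of Monday as the start of the week are assumptions made for this example.

```ruby
require "date"

# Count edits made today, since the start of this week, and since the
# start of this month. `edit_dates` is a hypothetical list of Date
# objects, one per edit.
def edit_stats(edit_dates, today)
  week_start  = today - ((today.wday - 1) % 7)  # Monday of the current week
  month_start = Date.new(today.year, today.month, 1)
  {
    today: edit_dates.count { |d| d == today },
    week:  edit_dates.count { |d| d >= week_start && d <= today },
    month: edit_dates.count { |d| d >= month_start && d <= today }
  }
end

today = Date.new(2008, 8, 21)
edits = [
  Date.new(2008, 8, 21), Date.new(2008, 8, 21), # two edits today
  Date.new(2008, 8, 19),                        # earlier this week
  Date.new(2008, 8, 4),                         # earlier this month
  Date.new(2008, 7, 30)                         # last month, not counted
]
p edit_stats(edits, today)
```

In a Rails app the same boundaries would more likely come from ActiveSupport helpers and a database count than from an in-memory array.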

Another good way to find out when your edits get applied is to subscribe to a company (or person or any other entity) RSS feed. Check out this earlier article on revision RSS feeds.

Rails link_to Weirdness Inside Namespaces
by Rob Olson on August 21, 2008

On our TechCrunch Elevator Pitches site we have an admin interface that lives inside of an “admin” namespace. The route declaration in routes.rb looks like this:

map.namespace(:admin) do |admin|
  admin.resources :videos, :member => {:update_status => :put}
  admin.resources :comments
  admin.root :controller => "videos"
end
 
map.connect "logged_exceptions/:action/:id", :controller => "logged_exceptions"

We also use the Exception Logger plugin to track exceptions. So we have a route for that.

I ran into trouble in the admin/videos/index.html.erb view when I attempted to link outside the admin namespace to the logged_exceptions page. This is what I was trying to do that doesn’t work:

<%= link_to "Exceptions", :controller => "logged_exceptions" %>

The code above creates the following URL: http://foo.com/admin/logged_exceptions. But what I needed is this: http://foo.com/logged_exceptions. This problem has not come up before because we normally use named routes, which resolve to the correct route regardless of the current namespace.

The solution is really simple but took me a while to figure out. To explicitly direct Rails to look for the controller at the site root, place a “/” before the controller name. The correct link_to statement looks like:

<%= link_to "Exceptions", :controller => "/logged_exceptions" %>
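
To make the rule easier to picture, here is a toy model in plain Ruby of how a controller name appears to be resolved relative to the current namespace. This is only an illustration of the behavior described in this post, not Rails’ actual implementation, and resolve_controller is a made-up helper.

```ruby
# Toy model of the behavior described above; NOT Rails' real code.
# `current_namespace` is the namespace of the controller rendering the view.
def resolve_controller(current_namespace, controller)
  if controller.start_with?("/")
    controller.sub(%r{\A/}, "")           # absolute: ignore the namespace
  elsif current_namespace.nil? || current_namespace.empty?
    controller                            # no namespace to inherit
  else
    "#{current_namespace}/#{controller}"  # relative: stay inside the namespace
  end
end

puts resolve_controller("admin", "logged_exceptions")
puts resolve_controller("admin", "/logged_exceptions")
```

The first call mirrors the link that went to the wrong place, and the second mirrors the fixed version with the leading slash.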

Hopefully this helps anyone in the same situation.