Executive Order about Open Data

Earlier today, President Obama signed an Executive Order directing his administration to take historic steps to make government-held data more accessible to the public and to entrepreneurs and others as fuel for innovation and economic growth.

Here’s what you need to know:

  • The Executive Order declares that information is a valuable resource and strategic asset for the nation.
  • Newly generated government data will be required to be made available in open, machine-readable format by default — enhancing their accessibility and usefulness, and ensuring privacy and security.
  • These executive actions will allow entrepreneurs and companies to take advantage of this information — fueling economic growth in communities across the Nation.


Learn Linked Data

EUCLID is an EU-funded project that teaches practitioners how to use Linked Data in their daily work. The first tutorials have been released, including digital book chapters in HTML, iBook, ePUB and Kindle formats.


Languages for the Web

[Figure: top languages based on the T-Index 2012]

The official count of languages that humans speak is somewhere around 7,000. Does this mean we have to translate all our material, especially our Web pages, into every one of them? According to the T-Index 2012 created by translated.net, 10 languages are enough to reach more than 80% of the online sales potential. The index is based on GDP per capita. Add 5 more languages and you cover 90% of the online sales market. If we follow the United Nations instead, the 6 official languages are English, Chinese, Russian, Arabic, French and Spanish.
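The reasoning behind "10 languages for 80%, 15 for 90%" can be sketched as a cumulative-share calculation: sort the languages by their share of the market and add them up until the target is reached. The sketch below is only an illustration. The function name `languages_needed`, the language labels A to J and their percentage shares are hypothetical placeholders, not T-Index figures.

```python
from typing import List, Tuple


def languages_needed(shares: List[Tuple[str, int]], target: int) -> List[str]:
    """Pick languages in descending order of share until the
    cumulative share (in percent) reaches the target."""
    chosen: List[str] = []
    covered = 0
    for name, share in sorted(shares, key=lambda item: item[1], reverse=True):
        if covered >= target:
            break
        chosen.append(name)
        covered += share
    if covered < target:
        raise ValueError("target cannot be reached with the given shares")
    return chosen


# Hypothetical market shares in percent (placeholders, not real data).
SHARES = [
    ("A", 30), ("B", 20), ("C", 15), ("D", 10), ("E", 8),
    ("F", 5), ("G", 4), ("H", 3), ("I", 3), ("J", 2),
]

for goal in (80, 90):
    picked = languages_needed(SHARES, goal)
    print(goal, len(picked), picked)
```

With real share figures from an index of your choice, the same loop shows how quickly the benefit of each additional language flattens out.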

Based on the statistics provided by Internet World Stats, the following languages will help you to communicate with more than 80% of the population: English, Mandarin Chinese, Spanish, Japanese, Portuguese, German, Arabic, French, Russian and Korean.

If we trust Forbes and its analysis of the top 10 countries for doing business, we can use that ranking to guide the choice of languages. The Forbes top 10 are: United States (10), Germany (9), South Korea (8), Switzerland (7), Australia (6), Austria (5), Israel (4), Sweden (3), Finland (2), Singapore (1).

Bloomberg has published another analysis that ranks the top 20 countries for doing business. There are certainly more statistics out there, but it seems that understanding English, Chinese, German, Japanese and French will help in building a business.

Top 10 Strategies by Gartner for 2013

The top technology trends impacting information infrastructure in 2013 include the following (excerpt from the Gartner report, http://www.gartner.com/resId=2340315):

Big Data

Gartner defines big data as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. Big data warrants innovative processing solutions for a variety of new and existing data, to provide real business benefits, but processing large volumes or wide varieties of data, remains merely a technological solution, unless it is tied to business goals and objectives. New forms of processing are not necessarily required, nor are new forms of processing always the least expensive solution (least expensive and cost-effective are two different things). The technical ability to process more varieties of data in larger volumes is not the payoff. The most important aspects of big data are the benefits that can be realized by an organization.

Modern Information Infrastructure

Information management (IM) is a discipline that requires action in many different areas, most of which are not technology specific. Central to success is an enabling technology infrastructure that helps information producers and information consumers organize, share and exchange any type of data and content, anytime, anywhere. This enabling technology infrastructure is what Gartner calls a modern information infrastructure. Because it must support a wide range of information use cases and information types, it is essential that information infrastructure be viewed as strategic, so that a vision to develop it in a cohesive and aligned way over time is possible. Organizations that establish a road map for this type of cohesive, application-independent and information-source-independent set of IM technology capabilities are best placed to achieve long-term enterprise IM (EIM) goals.

Semantic Technologies

Semantic technologies extract meaning from data, ranging from quantitative data and text, to video, voice and images. Many of these techniques have existed for years and are based on advanced statistics, data mining, machine learning and knowledge management. One reason they are garnering more interest is the renewed business requirement for monetizing information as a strategic asset. Even more pressing is the technical need. Increasing volumes, variety and velocity — big data — in IM and business operations, requires semantic technology that makes sense out of data for humans, or automates decisions.

The Logical Data Warehouse

Data warehouse (DW) architecture is undergoing an important evolution, compared with the relative stasis of the previous 25 years. The DW is evolving from competing repository concepts, to include fully enabled data management and information processing platforms. These new warehouses force a complete rethink of how data is manipulated, and where in the architecture each type of processing occurs that supports transformation and integration. It also introduces a governance model that is only loosely coupled with data models and file structures, as opposed to the very tight, physical orientation used before.

NoSQL DBMSs

NoSQL DBMSs — key-value stores, document-style stores, and table-style and graph databases — are designed to support new transaction, interaction and observation use cases involving Web scale, mobile, cloud and clustered environments. Increasing adoption and growing customer demands have opened up a significant gap between commercially supported NoSQL DBMSs and open-source projects that have only community support. The latter remain immature and are used by Web developers for applications that are not mainstream. Commercial products are using their added funding not only to build sales, support and marketing, but also to add enterprise-class features intended to widen adoption and win new business. The growth of the ecosystem will have an impact on broadening adoption. However, awareness is still limited and the leading players remain off the direct sales playing field, slowing their penetration of corporate IT strategic plans. As a result, business impact in 2012 was moderate, but in 2013 is increasing as more organizations investigate and experiment.

In-Memory Computing

In-memory computing is an emerging paradigm, enabling user organizations to develop applications that run advanced queries on very large datasets, or perform complex transactions at least one order of magnitude faster (and in a more scalable way) than when using conventional architectures. In-memory computing opens unprecedented and partially unexplored opportunities for business innovation (for example, via real-time analysis of big data in motion) and cost reduction (for example, through database or mainframe off-loading).

Chief Data Officer and Other Information-Centric Roles

EIM requires dedicated roles and specific organizational structures. Specific roles, such as chief data officer, information manager, information architect and data steward, will be critical for meeting the goals of an EIM program. The fundamental objectives of the roles remain constant: to structure and manage information throughout its life cycle, and to better exploit it for risk reduction, efficiency and competitive advantage. The enterprises that are moving first to create these roles, and to train for them, will be the first to benefit from information exploitation.

Information Stewardship Applications

Governance of data is a people- and process-oriented discipline that forms a key part of any EIM program. The decision rights and authority model that forms governance has to be enforced and operationalized. This means that this technology is needed to help formalize and combine the day-to-day stewardship processes of (business) data stewards into part of their normal work routines. The formation of this specific toolset needs to be closely targeted at the stewardship of primarily structured data. The continued high growth and interest in master data management (MDM) programs is driving much of the interest in this technology, because MDM gives these solutions recent and specific context, which makes them applicable and meaningful to users. However, other initiatives, such as data quality improvement and broadening information governance goals, are also driving demand.

Information Valuation/Infonomics

Information valuation is the process by which relative value or risk is assigned to a given information asset or set of information assets. The question of the value of information has been around for a long time; however, a more formal approach to information valuation is beginning to take hold in leading-edge organizations. When considering how to put information to work for the organization, it is important to not only think about information being like an asset, but also to actually value and treat it as if it were an asset. Any number of established methods for valuing intangibles (for example, market approach, cost approach or income approach) can be used, or organizations can select valuation methods that map to nonfinancial key performance indicators.

Inside Google: Searching 30 Trillion Web Pages

This well-presented inside story from Google explains the basics of search. Google claims to process 30 trillion web pages and says that the results you get in your browser, tablet or smartphone are generated within 1/8 of a second. A lot of effort is also put into the intuitive user interface. When you search, Google tries to figure out not just what you’re typing into the box, but what you mean. So algorithms for spelling, autocompletion, synonyms, and query understanding jump into action. When Google thinks it knows what you want, it pulls results from those 30 trillion pages and 100 million gigabytes, but it doesn’t just give you what it finds. First, a ranking procedure uses over 200 closely guarded secret factors that look at the freshness of the results, quality of the website, age of the domain, safety and appropriateness of the content, and user context like location, prior searches, Google+ history and connections, and much more. One example is the link to the Google Knowledge Graph.


Visualise Open Government Data: A competition

For everyone who wants to compete and demonstrate how they use and visualise Open Government Data, I suggest digging deeper into the competition. Below is the text extract that was published on semanticweb.com, which links to the article in the Guardian.

(The Guardian, Google, and the Open Knowledge Foundation have launched a new competition to find the best open government data visualization. The announcement states: “Governments around the world are releasing a tidal wave of open data – on everything from spending through to crime and health. Now you can compare national, regional and city-wide data from hundreds of locations around the world. But how good is this data? We want to see what you can do with it. What apps and visualisations can you make with this data? We want to see how the data changes the way you see the world. In conjunction with Google and the Open Knowledge Foundation (who will be helping us judge the results), see if you can win the $2,000 prize.”)

Here is the link to the Guardian article.