Tuesday, December 30, 2014

#LOC - #Wounded Knee II

At the time I blogged about the restoration of an old photograph taken after Wounded Knee. Back then many people were actively restoring pictures as a group. This is just one of the important restorations undertaken by Durova.

Now almost five years later, I learned that the Library of Congress included the restored picture with the original. Now that is awesome.

#Wikipedia - She did not study at #Washburn #University

I like this picture. It is all I like about this person who according to the article about her studied at Washburn University. According to the article about the Washburn University school of law, she is a "notable alumni".

The issue is that there is no category for these alumni yet. It is something the Wikipedia editors did not do. So I want to borrow one of the slogans from the picture:
Mourn your sins
The others fit the person in the picture more than anything else. It is certainly colourful.

Monday, December 29, 2014

#Wikidata - 2000 #categories with a #list

When people die and it becomes known in Wikipedia, often more can be said about that person than just his date of death. They may have studied, held an office, been a member of a sports team or a political party. Much of this information is registered in categories.

Thanks to Autolist2 it is easy to add all that information to Wikidata. The best bit is that we can indicate what conditions the articles in the category have to fulfill in Wikidata itself. It is picked up by the Reasonator and this shows in these 2000 categories that have had the treatment so far.

So check out these categories, click on the Reasonator icon and be amazed about the number of items that have been enriched slowly but surely.

PS You can do this as well :)

Sunday, December 28, 2014

#Wikidata - an #argument about a #list is not worth it

Eternal arguments are not worth it. From a Wikidata point of view, lists only represent articles on a Wikimedia project. They have little to no value on their own. The argument has it that you can either re-use lists to represent subjects OR you create items for the subject separately.

Sorting this out is a mess that does not bother me. I can be bothered to implement either of those two options.

The President of Sardinia is an item I created because I do not see the point in arguing the point yet again. I hope someone will revert me BUT first ends the argument in a conclusive manner.

Thursday, December 25, 2014

#Wikidata - a busman's holiday

I am on a break. The people who died do not inspire me to find that they studied, held an office, or were a professor at a university or campus. I do not know who their family was, I do not know if they participated in an event. It died.

What to do..

I challenged students to point out the successes of their university or campus. To be brutally honest, there are already enough of them from German and American schools. I do not really care about them. This is different for people from countries like India, Russia, China. From a numeric perspective there are too few of them known in Wikidata.

So I am adding people who studied in India. It beats rehashing old discussions. It beats convincing Wikidata people that my time is as precious as theirs. It beats being upset about the lack of experience by some of them. It beats eating too much.

Season's greetings

Every year there is a SignWriting card like this one. It is sent to the people subscribed to the SignWriting mailing list. The cards are created by the list members themselves, this time by Stefan and Elfriede Wöhrmann.

There is no card they can buy from Hallmark but the need for cards for the season's greetings is as high as for anyone else. Most years I put the card on my blog and this year is no exception. Each year I hope for at least one Wikipedia in a sign language. Maybe in 2015. This is what Steven Slevinsky has to say about it. It explains the challenges and the hope for the future really well.
I believe that 2015 will be the year of the sign language Wikipedia.  I believe the ASL Wikipedia activity will increase and other sign language Wikipedia projects will start. 
The new TrueType fonts are a great step forward: pages are 30 times smaller and SignWriting can be displayed without an internet connection.  Each page is reduced to a single connection, eliminating a unique connection for each sign/symbol on the page. 
The new editors based on the TrueType fonts are nearing completion.  This will increase the spread of written sign language all over the Internet. 
Written sign language will also spread because of the inclusion of the SignWriting symbols in Unicode 8. 
So 2015 will be an amazing year for SignWriting.  We'll have TrueType fonts, new editors, Unicode, and Wikipedia.  A great combination for success.

Wednesday, December 24, 2014

#Wikidata - #Death is taking a break

It has been my pleasure to kill off many of the people who died in 2014 in Wikidata. It has been a struggle to keep up; they never stop. I use a tool to find most of them. It died on me and now I am taking a break.

The tool was developed by Magnus and, like all his other tools, it serves a purpose. It is even important; there is no alternative. When it breaks, there is the convenient excuse: "it is not official".

It is convenient and it ignores how important these tools are. Both the Wikimedia Foundation and the chapters know how important these tools are and find it inconvenient when this is pointed out to them. Do they have a responsibility in this; is it important that the GLAMs are supported in their need for statistics?

This self-absorption means that only their own stuff is deemed worthwhile enough for support, never mind that it does not have the oomph to deliver, never mind that their alternative is only a mirage in a developer's mind.

In the end it takes a Magnus to support many of the tools that make a difference in the real world. Magnus is taking a needed break. I wish him health, joy and happiness for the coming year. He is my Wikimedian of the year.

Friday, December 19, 2014

#Wikimedia Foundation - does not get one € of mine

The WMF Fundraiser will be a success. I hope it will be and when you feel like it, please do donate. You can donate by credit card, by money transfer or by PayPal. Please do.

However, I will not make a donation except for the donation of my time. Paying money is comparatively superbly organised in Europe. As a person you can transfer money without cost within Europe. All it takes is knowledge of the IBAN number to transfer to and the name of the entity you transfer money to. Easy peasy.

When you have a website with customers in the Netherlands, you can use a system called iDEAL; it has the virtue of being cheap. The Wikimedia Foundation does not support cheap.

PayPal and credit cards cost money; they deduct a fee from the amount given.

In Great Britain an organisation collects money for the WMF for a fee. At the same time there is a UK chapter that could easily organise it for the WMF and in the process hone its skills in fundraising, a skill the WMF wants it to develop anyway.

I refuse to pay these additional costs.

While I hope the WMF collects all the money it wants in the USA, it effectively hands over ownership to the USA and its way of working. There is little consideration for the rest of the world because if there was, the WMF would actively welcome more monetary contributions and partner with its chapters in raising the funds needed for our movement and not only for the Wikimedia Foundation.

Thursday, December 18, 2014

#Google - Would you use your #trickery for us?

After the shock and awe of yesterday's announcement, there has been time to think. Google showed a real interest in Wikidata. It created a new tool to help improve the quality of its data. But the real expertise of Google is in determining the probability of facts. It is part and parcel of its ranking algorithms.

It would be just as awesome if Google indicated those statements it deems to have a less than even chance of being true. The combination of such a list and the new tool would make the efforts of the people seeking sources all the more relevant. When statements are debunked, it has a potential quality effect on all the associated Wikimedia projects. Given that it is probable that most statements are fine, it makes for more concentrated effort and consequently its effects will be noticed.

While we are on this line of thought, given the data of Freebase, Google could indicate based on its algorithms how probable its sets of data are. Everything that is highly likely should be a candidate for import in Wikidata. The other reason for importing data into Wikidata anyway is that it is an invitation to all the Freebasers to join our ranks, increase our expertise and together be awesome.

Wednesday, December 17, 2014

#Google - What it does with #Freebase is beyond awesome

In an e-mail Denny Vrandečić announces an astounding bit of news. It effectively says that Wikidata can have all of Freebase's data if it wants it.

It then goes on to say that Wikidata is not expected to accept all this data, and follows with an announcement of a tool that is to source data. This news is best read as it was announced:
Thank you Google!
Freebase was launched to be a “Wikipedia for structured data”, because in 2007 there was no such project. But now we do have Wikidata, and Wikidata and its community is developing very fast. Today, the goals of Freebase might be better served by supporting Wikidata [1]. 
Freebase has seen a huge amount of effort go into it since it went public in 2007. It makes a lot of sense to make the results of this work available to Wikidata. But knowing Wikidata and its community a bit, it is obvious that we can not and should not simply upload Freebase data to Wikidata: Wikidata would prefer the data to be referenced to external, primary sources. 
In order to do so, Google will soon start to work on an Open Source tool which will run on Wikimedia labs and which will allow Wikidata contributors to find references for a statement and then upload the statement and the reference to Wikidata. We will release several sets of Freebase data ready for consumption by this tool under a CC0 license. This tool should also work for statements already in Wikidata without sufficient references, or for other datasets, like DBpedia and other machine extraction efforts, etc. To make sure we get it right, we invite you to participate in the design and development of this tool here: 
 https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool 
I hope you are as excited as I am about this project, and I hope that you will join me in making this a reality. I am looking forward to your contributions!  
[1] https://plus.sandbox.google.com/109936836907132434202/posts/bu3z2wVqcQc

Denny Vrandečić via lists.wikimedia.org 

Tuesday, December 16, 2014

#Wikidata - WDQ with load balancing

The #Wikipedia app on the #mobile is to give you everything that is near you. The question is, should it be based on Wikipedia or on Wikidata data? In order for software to find geo references, "magic words" need to be employed in Wikipedia. These same magic words can be used to harvest the information for Wikidata.

So what are the benefits of using Wikidata over Wikipedia with magic words? Most importantly, there is only one Wikidata and there are 280+ Wikipedias. Everyone seeking information about subjects nearby is as entitled to great information as anyone else.

Wikidata does not have official query functionality. But it does have WDQ. Magnus and Yuvi are finishing the implementation of load balancing for WDQ. So the question is not whether we can serve geo coordinates from Wikidata but whether we can afford to let this opportunity pass us by.

PS There has been no evaluation of WDQ yet by WMF engineers. Why not?
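
To make the "near you" question concrete, here is a minimal Python sketch of what such a lookup boils down to once coordinates are available as data. It is not how WDQ or the app works internally; the item labels and coordinates are made up, and haversine_km and nearby are hypothetical helper names.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def nearby(items, lat, lon, radius_km):
    """Return (distance, label) pairs within radius_km, nearest first."""
    hits = []
    for label, item_lat, item_lon in items:
        distance = haversine_km(lat, lon, item_lat, item_lon)
        if distance <= radius_km:
            hits.append((distance, label))
    return sorted(hits)

# Made-up items: (label, latitude, longitude)
items = [
    ("item A", 52.00, 5.0),
    ("item B", 52.01, 5.0),
    ("item C", 53.00, 5.0),
]

for distance, label in nearby(items, 52.0, 5.0, radius_km=5):
    print(f"{label}: {distance:.2f} km")
```

Because the coordinates are stored once, the same lookup serves a reader in any language; only the labels shown would differ.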

Sunday, December 14, 2014

#Wikipedia - The Time Jumpers

Recently one of the Time Jumpers, Dawn Sears, died. She was married to one of the other band members, Kenny Sears; he is the one playing the fiddle on the right.

According to her Wikipedia article, she was indeed married to a Kenny Sears. This is however a redirect to someone else. The husband is known on the Time Jumpers article as Kenny Sears (fiddler); it is a red link.

It is easy enough to add an item for Mr Sears in Wikidata and link him to both the Time Jumpers and to his wife. It would be good if the Wikipedia red link could be linked to Wikidata. When red links are linked to Wikidata, it is possible to relate them to existing items. In this way information is available that can be used as the basis for a possible article. To bring this information to an editor, it just needs to be presented on the red link. That is easy enough.

Friday, December 12, 2014

#Wikidata - Hans Wallat, conductor

According to the English Wikipedia Mr Wallat was awarded the Musikpreis der Stadt Duisburg. It is a "red link". The German Wikipedia has an article about this award. It lists all the winners of this award.

Using the Linked Items tool, it is trivially easy to add statements for all but three of the winners of this award. Those three are red links on the German Wikipedia, but it is easy enough to add items for them.

Arguably they are notable because they complete the list of all the winners of this award. Adding dates is icing on the cake.

On the English Wikipedia it is nice to link to the Reasonator for the award. It links to a Reasonator page for the awardees. It is how we can share in the sum of all available knowledge.

Thursday, December 11, 2014

#Wikipedia - #redirects are a one trick pony

Wikipedia and Wikipedians have grown up with the "benefits" of redirects. It is why an article is also known by a different name. In Wikidata they can be labels.

Another use is to link a name to somewhere in an article where they are mentioned. When this finds its way in Wikidata it is assumed that proper information is available on the subject in that Wikipedia article.

Wrong. When you read a Wikipedia article, it is full of all kinds of references from the subject. All of these references are also available in Reasonator in the concept cloud. Many of the references are available in statements and they in turn are available on the referred to qualifiers as well.

What something like Reasonator could do is provide proper information for all the subjects that do not have an article and refer to articles when they exist. It currently links to other Reasonator pages but it is not hard at all to configure this to link to Wikipedia articles in the "current" language. This would be a redirect on steroids.

Monday, December 08, 2014

#Wikipedia- The numbers are what II ?

Numbers, statistics have a purpose. Their typical use is to have manager types consider how things are moving. Even though they have infinite wisdom there is not much that they can do when all numbers do is show trends.

The funny thing is that only the numbers that are collected are reflected in these trends. Wikidata for instance attracts no readers and consequently it may not be considered as a source of attention. It is a fallacy and it does not motivate people interested in Wikidata.

There are so many lists that could motivate people. Articles that need writing, differences in data between Wikipedias. The wonderful thing is that they all bring a sense of purpose and are an inspiration to improve both quality and quantity.

My favourite list is the list of zombies for 2014. Currently there are 400+ zombies that need to be killed off. It motivates because I know what to do and it is a convenient way to find categories of information that can easily be imported in Wikidata.

Statistics, numbers can motivate people to be more effective. That is how you influence the numbers these manager types look at.

Sunday, December 07, 2014

#Wikipedia- The numbers are what ?

So the numbers are flat. You want initiatives that help provide us with relevance. OK, how about this scenario:

A reader queries Wikipedia for a subject and does not find it. Many more people query for this subject and it becomes the most wanted subject without an article. An editor writes the article and it proves popular. It becomes the most read new article in the next month.

In this way:
  • we give our search statistics a purpose
  • we indicate what subjects our readers want articles about
  • we celebrate the most read new articles and their editors
  • we advertise that we ask our editors to write articles people are looking for
  • we can do this for every Wikipedia in every language
Yes, we can provide search results from Wikidata as well. When people make use of this, it counts as a not found instance. In the meantime we did provide information that is available to us.
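
The scenario above comes down to counting the searches that found no article. A minimal sketch in Python, assuming a plain list of search strings and a list of existing titles; both are made up here, and most_wanted is a hypothetical name, not an existing Wikimedia tool:

```python
from collections import Counter

def normalise(text):
    """Collapse whitespace and ignore case so variants count as one subject."""
    return " ".join(text.split()).casefold()

def most_wanted(search_log, existing_titles, top=3):
    """Return the most searched-for subjects that have no article yet."""
    known = {normalise(title) for title in existing_titles}
    misses = Counter(
        normalise(query) for query in search_log
        if normalise(query) not in known
    )
    return misses.most_common(top)

# Made-up search log and article list, for illustration only
search_log = [
    "Kenny Sears", "kenny sears", "Dawn Sears", "Hans Wallat",
    "Kenny  Sears", "Musikpreis der Stadt Duisburg",
    "Musikpreis der Stadt Duisburg",
]
existing_titles = ["Dawn Sears", "Hans Wallat"]

for subject, count in most_wanted(search_log, existing_titles, top=2):
    print(f"{count}x {subject}")
```

The same count, run per language, would give every Wikipedia its own list of most wanted articles.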

#Wikidata - KCG College of Technology

Thank you Google
A friend of mine studied at one of the engineering colleges in India. He now works for the Wikimedia Foundation, and he rocks.

There are articles for many of these colleges and they all could do with some loving attention from the people who study or studied there.

How about adorning the items in Wikidata with the name in an Indian script? How about adding an image? It will show in Reasonator on all the people who studied or taught there.

#Wikipedia - reaching out for #ebola

When people are not well informed about Ebola, they panic. It is therefore ever so important to get the message out and get the message right. The right information is particularly important for the people who live with Ebola, who see the effects first hand.

They live in Africa, in countries like Guinea, Sierra Leone and Liberia. A study shows that the most used source for information about Ebola in these countries is Wikipedia.

For these people many of the things we take for granted are a dream. When you know about Gapminder, you know things are improving and not as bleak as they seem. People are living longer, getting educated and infrastructure is improving. It is why Wikipedia, also thanks to Wikipedia Zero, is reaching parts that others do not reach as effectively.

Wikipedia is effective thanks to the dedication of its volunteers, particularly those who ensure the quality of medical articles. Wikipedia comes cheap. It is the best investment in bringing knowledge to a world that is sometimes desperate for great basic and actionable information.

Saturday, December 06, 2014

#Wikidata - #Bangladesh #University of #Engineering and Technology

The Bangladesh University of Engineering and Technology currently knows 16 people who studied there. Two of them were in a category that had not yet been created. In theory 16 articles are waiting to be added to this category. In addition, someone may want to categorise the category.

The good news in all this:
  • Wikidata can be used to populate Wikipedia categories
  • people are actively adding information in Wikidata first
  • by adding students and faculty in Wikidata, people are connected and consequently the universities and colleges get exposure
  • maybe my call to the college boys and girls is heard by some of them

Friday, December 05, 2014

#Wikipedia - #Russia and the #USA

Yesterday I wondered when Russia will become the champion user of Wikipedia. Today I noticed a presentation with the numbers. Traffic to Wikipedia from the USA is down by 8.6% while the annual growth of Wikipedia traffic from Russia is +10.3%. A gap of almost 19 percentage points.

Thursday, December 04, 2014

#Wikipedia - #Russia #rules OK

English Wikipedia is still the biggest. The Russian Wikipedia however is growing much faster. It replaced the Germans, the Spanish and the Japanese to become the second biggest Wikipedia. If it continues to grow like this it will overtake English Wikipedia.

There is one question in the back of my head. When you consider only the traffic from the USA for the English Wikipedia, and from Russia for the Russian Wikipedia, how will they compare? How long will it take for Russia to overtake the USA as the champion of Wikipedia?

Wednesday, December 03, 2014

#Wikimedia and its content delivery

The #vision is "share in the sum of all knowledge" and all our projects contain a wealth of information. The infrastructure, the software that brings this information, is at this time very much centred on the two ends of delivery. It is in the data centres and it is in the last mile.

The effort of the last mile is the Wikipedia Zero project. This wonderful project brings information at no cost to the mobile phones of people who use the services of cooperating mobile operators. The content they use comes from the WMF data centres in the USA, and that is suboptimal.

It is suboptimal because it takes time to get that data from the first world data centres of the WMF. It is suboptimal because the pipes are often oversubscribed. The consequence is that the service is not as good as it easily could be.

With a "content delivery network", this information is kept locally and it is only the updates that have to come and go all the way to the central servers in the USA.  This is a lot less data for those pipes, it is a lot cheaper to operate for our cooperating mobile operators in Wikipedia Zero and the quality of service will improve a lot.
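
The idea can be shown with a toy model in Python. This is only a sketch of the principle, not of how the WMF caches or how any CDN product works; EdgeCache and its methods are hypothetical names, and the "origin" is a plain dictionary standing in for the central servers.

```python
class EdgeCache:
    """A local copy of pages that only contacts the origin when it has to."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: title -> page content
        self._pages = {}                  # locally stored copies
        self.origin_requests = 0          # how often the long pipe was used

    def get(self, title):
        """Serve from the local copy; fetch from the origin on first use."""
        if title not in self._pages:
            self.origin_requests += 1
            self._pages[title] = self._fetch(title)
        return self._pages[title]

    def purge(self, title):
        """Called when the origin reports an edit: drop the stale copy."""
        self._pages.pop(title, None)

# A dictionary stands in for the central servers
origin = {"Ebola": "revision 1"}
cache = EdgeCache(origin.__getitem__)

for _ in range(1000):          # a thousand local readers
    cache.get("Ebola")
print(cache.origin_requests)   # one trip to the origin so far

origin["Ebola"] = "revision 2" # the article is edited centrally
cache.purge("Ebola")           # only this update has to travel
print(cache.get("Ebola"), cache.origin_requests)
```

A thousand reads cost one trip over the oversubscribed pipe; after an edit, one more trip brings the new revision.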

There are no technical reasons why the WMF cannot do this. All that I see is personal preferences and possibly some legal issues. The WMF has the experience because of its servers in Amsterdam. It should be relatively easy to mimic this at the sites of our cooperating mobile operators. Alternatively we could pay commercial rates and do it ourselves.

A lot of effort is invested in making Wikipedia and MediaWiki perform better. This is another obvious improvement that will make a big difference not only to our Wikipedia Zero users but to everyone who uses our projects outside of the USA and much of Europe.

Sunday, November 30, 2014

#Wikimedia & diversity - Mary Hinkson

Mrs Hinkson died November 26, 2014. She was for a long time a member of the Martha Graham Dance Company. She is considered to have been very influential; she was awarded the Martha Hill Lifetime Achievement Award. She studied at a university, she taught at a university, and according to the Wikipedia article she influenced both ballet and dance.

Wikidata knows about this because I took the time to add these facts. In this way I can point to the failings of my approach to adding data.

When an article does not have relevant categories, typically I will not add associated information. Mrs Hinkson is highly notable and the only category that adds information for her is: "American female dancers". 

There are many things I will not state. Nationality gets you in conflict in too many ways, so does race, religion. The consequence is that Wikidata is fairly uninformative about this. From a diversity point of view, it is not that great.

I think when race, religion and nationality play a big role in an article, the Wikipedia categories may not be all that inclusive. Finding out whether this is true takes some research. I am looking forward to the results.

#Wikimedia - #Wikidata; a recurring subject

The last Foundation Metrics meeting is kinda interesting, particularly when you are interested in #Wikidata. In his part of the meeting Erik considers the implications of Wikidata and wonders what it takes to help it lift off.

Erik wants us to change the world. Now that is a big statement. It can be done. It takes big thinking, maybe even bigger thinking or maybe no thinking at all. In the vision Erik presented, it is all about data and leveraging the data for instance in info-boxes.

Have a closer look at Reasonator, at its statistics and at these specific statistics (it takes a long time to load). What you find is probably the easiest and most effective thing to do. It is allowing people to add labels in their own language. Labels leverage all Wikidata statements for a language. With those labels you can disambiguate effectively in a search. Try it in Reasonator. Now change the language and notice that effectively the automated descriptions are still there. Now do the same with search in Wikidata. See?

At Wikimania there was a heated discussion about the need for descriptions. The only half-baked argument to keep the current descriptions was that people outside expect them. Let's not be strangers.

Friday, November 28, 2014

#Wikimedia #Labs - my #stake is rare, bruised

I have a stake in Wikimedia Labs. I rely on it. I am not the only one. Wikimedia chapters rely on it; they need it for many of their activities. Glam is one area vital to them that relies on Labs.

For the Wikimedia Foundation, Labs is second tier. They have a few people dedicated to Labs. Good people, well intentioned people but what they offer is not production quality. They cannot for several reasons. There are not enough resources for them to do what is needed.

Chapters are second class citizens as well. The fact that Labs is vital to achieving their aims has so far not made a noticeable difference. In my opinion it is not only the Wikimedia Foundation that can and should make a difference. It is the chapters themselves as well.

I urge the chapters to invest in Wikimedia Labs. It is BOTH the responsibility of the WMF and the chapters to provide adequate support. During business hours operational support should be available. Stakeholders in both WMF projects and chapter projects rely on adequate service.

Today is Black Friday in the USA. Yesterday was Thanksgiving. When all staff celebrate their turkey we are left to fend with even less.

Thursday, November 27, 2014

#Wikidata - #today, #tomorrow

In #Reasonator you can check out dates. The first people of today are known to have died. Who will die tomorrow is only known to God. All we have to do is wait and see.

When we are all done with Wikipedia, all the living people will have died. Hmmm, that is a long time coming. First we have to kill off the ones not known to be dead yet.

#Wikimedia - #empower the #chapters

It has been #budget time for the Wikimedia chapters. As it is centrally decided what chapters "get" and as the finances of the main organisation are not considered under equal terms, they are secondary by definition.

To prove this, a few points:
  • The WMF director defined criteria for quality for the chapters
  • The chapters are barred from involvement in the annual WMF fundraising
  • The chapters rely on funding from the WMF AND the metrics of success do not exclude the cost of WMF related admin
  • The chapters can not compete for the resources the WMF assumes its own for new endeavours
  • The chapters are not represented at the office of the WMF
Many of these points have a long history and are sacred cows to some. My point is very much that there are many small things that can make the distinction less stark. It starts with an awareness that chapters support open culture and a community in a country. They would benefit from shared resources that can be made available after minor modifications of what is already there. Our movement is not only English Wikipedia and does not only have a USA or alternatively a world view.

Wednesday, November 26, 2014

#Wikipedia - Hey, #College Boy II

Remember? At this time, of all the 195085 notable people with an alma mater, 158649 are men and 29059 are women; for 7377 no gender is known.
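
As a quick check on these numbers, a few lines of Python confirm that the three groups add up to the total and show what share each group represents. The counts are the ones quoted above; the variable names are only for illustration.

```python
total = 195085
counts = {"men": 158649, "women": 29059, "unknown": 7377}

# The three groups should account for everyone with an alma mater
assert sum(counts.values()) == total

# Share of each group as a percentage, rounded to one decimal
shares = {group: round(100 * n / total, 1) for group, n in counts.items()}

for group, share in shares.items():
    print(f"{group}: {share}%")
```

Roughly four in five of the people with a known alma mater are men, which is the gap the call to the college boys and girls is about.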

They include all the boys and girls of *your* university. Take the University of Virginia for instance. When I first looked at it, there were only 142 alumni. The category knew about at least 815 more of them. They are being added as well, software permitting.

This query has all the UoV alumni. These are all the men and these are all the women. Maybe this is a good time to write Wikipedia articles about the female UoV alumni, or to identify their existing articles to Wikidata.

Sunday, November 23, 2014

#Commons - the Como Cathedral

The Como Cathedral is a cathedral in Como, Italy. In it you will find works of art that are represented in Commons. This link to the works of art is established through an "institution template". It was easy to link the Como Cathedral by adding this: "| wikidata    = Q1101730" in the template.
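
For anyone who wants to do this for more than one template, the edit itself is mechanical. Below is a minimal Python sketch that works on the wikitext as a plain string; it does not talk to Commons, the sample template text is made up, and add_wikidata_param is a hypothetical helper name.

```python
import re

# Matches an existing "| wikidata = ..." line inside the template
WIKIDATA_LINE = re.compile(
    r"^\|[ \t]*wikidata[ \t]*=[ \t]*(.*)$",
    flags=re.MULTILINE | re.IGNORECASE,
)

def add_wikidata_param(wikitext, qid):
    """Return the template wikitext with the wikidata parameter set to qid.

    An existing non-empty value is left alone; an empty one is filled in;
    a missing one is added just before the closing braces.
    """
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item id: {qid!r}")

    line = f"| wikidata    = {qid}"
    existing = WIKIDATA_LINE.search(wikitext)
    if existing:
        if existing.group(1).strip():
            return wikitext                      # already linked
        start, end = existing.span()
        return wikitext[:start] + line + wikitext[end:]

    close = wikitext.rfind("}}")
    if close == -1:
        raise ValueError("no closing braces found")
    head = wikitext[:close]
    if not head.endswith("\n"):
        head += "\n"
    return head + line + "\n" + wikitext[close:]

# Made-up template text, for illustration only
sample = "{{Institution\n| name = Como Cathedral\n}}"
print(add_wikidata_param(sample, "Q1101730"))
```

Running it a second time on its own output changes nothing, so it is safe to repeat over the templates that are still waiting.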

At the moment there are 1149 templates waiting to be linked to Wikidata. With this link established, it is possible to either populate these templates with information from Wikidata or populate Wikidata with information from the templates.

It is a precursor for easily finding files in Commons that are linked to institutions. Many of them are GLAM partners of ours and it is yet another way of establishing how important they are to us.

#Wikimedia #chapters, the data

Guess what, Wikimedia chapters are linked to many other organisations. These organisations are known in Wikidata and now the chapters are known as well.

For many GLAM partners we have all kinds of statistics. We could link the partners to the chapters that they are connected to. It is the basis for information on the usefulness of the chapters.

#Wikimedia - the point of #collecting #data?

If #Wikidata is one thing, it is useful. It was useful from the start by including all the Wikipedia articles that are linked to articles in other languages. In the next phase statements were added and more and more articles were added that did not link to other articles. They were needed because they were a part in the expression of a statement. Then for all articles Wikidata items were created and still more items were created because they were needed in the expression of statements.

There is a point to linking the articles. It enables people to read about the same subject in other languages. There is a point to adding statements to items; it enables articles to be linked to whatever. This combination enables us to report on Wikipedia in ways not yet done.

If you want to know about the gender division: currently these are the men and the women in all our projects. Since June 2014, 90,850 more items became known to be women and 445,240 to be men. Interesting, but this information is not in a format that is "academic" or useful. Having this information in a bar chart with regular intervals gives more insight into what we have. Using old dumps for this is one solution. Breaking the information up per Wikipedia provides even more granular information.

Providing statistics in this way is good for several reasons:
  • it is public and verifiable information
  • it stimulates people to add statements about gender
  • it stimulates people to write about men and women
  • it makes it obvious that it is Wikidata where we know these things

Friday, November 21, 2014

#Wikimedia - first #standardisation, then #specialisation

The hardware and software used by the Wikimedia Foundation is increasingly standardised. It uses the same software and the configuration is centrally maintained. Good news; it makes for a stable platform. A stable platform allows us to share in "the sum of all available knowledge".

With this process well under way, special attention can be given to special projects. It has probably escaped your attention that the WMF now has a "Services group". They are the engineers that support the standalone software components that often run on their own machines and have very specific jobs, such as "generate a PDF from this article".

Wonderful news. If it did not escape your attention, did you notice that Stas Malyshev is getting up to speed on the Wikidata Query Service[1], figuring out what we need to do to make it suitable for widespread deployment of WikiGrok[2]?

Effectively it means that Magnus's query tool will be used by an updated version of the Games [3]. Now is that not sweet; Wikidata data being USED to leverage our community to improve Wikidata even more.
  1. https://wdq.wmflabs.org/
  2. http://www.mediawiki.org/wiki/Extension:MobileFrontend/WikiGrok
  3. https://tools.wmflabs.org/wikidata-game/

Thursday, November 20, 2014

#Wikimedia & Project #Gutenberg - the sum of all knowledge

"To share in the sum of all knowledge" is the vision of the Wikimedia Foundation. The Swiss chapter understands this really well. It has adopted Kiwix, an offline reader for content that is published in the ZIM format.

Project Gutenberg is a well-established organisation dedicated to the digitisation of books. Its catalogue of 50,000 public domain books is now available to everybody, everywhere and offline as well.

Thanks to a hackathon, all books are now available in the ZIM format and you can search in all the books at the same time. The best news is that not only has this work been done for the first time, it is built in such a way that it can be easily repeated.

Future deployments may include all the books of Wikisource, books from other sources and even copyrighted works as well. The point of Kiwix is that it is an enabler; it allows for the dissemination of knowledge and to achieve THAT is what our aim is.

Congratulations to the Swiss Wikimedia chapter for providing the sustained support of this valuable project.

#Wikidata - C. Rudhraiya; #filmdirector from #India

Mr Rudhraiya, who recently passed away, studied at the Adyar Film Institute. According to some, he brought fame to his alma mater. Mr Rudhraiya also studied at the St. Joseph's College, Tiruchirappalli.

The point is not so much that Mr Rudhraiya was a studied man, it is more that we know this about him. As more information like this is known about "living persons", they get a better representation in Wikidata.

At this time only two movies are known to be directed by Mr Rudhraiya. There must be many more. It is possible to know all the people he worked with by connecting him through his movies. With more data this information becomes more complete.

Wednesday, November 19, 2014

#Wikipedia - Nel Garritsen, a Dutch swimmer

Mrs Garritsen is one of only a few people who are known to have died and have an article in the Dutch Wikipedia. In that article it is currently not known that she died. We know it in Wikidata courtesy of the article in the English Wikipedia.

Every Wikipedia does things its own way. Because the Dutch Wikipedia does not have categories for people who died in a given year, there is no way to know about the recent deaths known in the Dutch Wikipedia. It is also not possible to indicate to the Dutch Wikipedians which people are known to be dead in other sources.

Mechanisms like this help to ensure that proper information is available for "living people". Arguably, maintaining categories with the people who died in a given year is a valuable instrument in an implementation of "BLP".

Tuesday, November 18, 2014

#Wikidata - Carl Sanders, is not the 74th "List of Governors of #Georgia".

It is said that the community is always right. It also has a short term memory and its consensus is not necessarily what you hope for.

Take Mr Sanders, he died recently and it was indicated that he was a "List of Governors of Georgia". It is an old argument that is the result of some bad practice at Wikipedia. The Wikipedia article includes mainly a list and consequently it is to be called a list. There is no article about the subject itself and hey "it must be a list in Wikidata as well".

It is simple to fix the situation for the governor of Georgia. All articles are lists, there is no Wikipedia that has both a list article and an article so I had the item identify the subject.

Using the category I added many of the "missing" governors; there were only 15 humans known to be governor of Georgia. I made all of them a politician and a US-American.

The community has every right to rehash old arguments. I just follow the old consensus and wait for the dust to settle yet again.

Sunday, November 16, 2014

#Wikimedia NL - my #Wikidata presentation - #WCN2014

The presentation I gave at the 2014 Dutch conference in Utrecht went well. Sadly, for whatever reason I found that it is not yet on Commons. That can be remedied.

When I present, the slides include the main points so when people doze off, they can always find what it was all about. This presentation is very much my view on Wikidata. I presented in Dutch and the slides are in English so that it can be easily re-used.

The points I made are:
  • Wikidata and its development are best understood thanks to the stats
  • Appreciating the information included is best done through the Reasonator
  • Wonderful tools exist that are sadly NOT part of plain vanilla Wikidata
  • Why and how I make so many edits ... the method in my madness
  • The Dutch Wikipedia COULD activate Wikidata search.. to share in the sum of all available knowledge
  • Much knowledge is not known to the Dutch Wikipedia
  • Wikidata already knows about much meta data on Commons thanks to the Creator templates

Saturday, November 15, 2014

#Wikidata - Jens Brugge; a judge from Norway

Mr Brugge, a high court judge from Norway, died. According to the article about him, his lineage is illustrious. Many generations in the Brugge family were quite notable. This can be seen in GeneaWiki2, and it can be shown inline or in a separate window from the Reasonator.

There is an increasing amount of genealogical information available in Wikidata. The value of all this data is not in having it, it is in using it. At this time 29,337 people are known to have a father and 13,336 people are known to have a mother. Obviously, these numbers will only increase and become more complete. Would it not be wonderful to share this information in Wikipedia articles as well?

Friday, November 14, 2014

#Wikidata - Thanks for the Book Award

It is very rewarding to read a good book and it is great when good books find their way to you. There are many literary awards known in Wikidata and the "Thanks for the Book Award" is one of many.

This award has Wikipedia articles on eight Wikipedias. It is a Finnish award and every year there is a new winner. This year it was Pauliina Rauhala for her book "Taivaslaulu".

Most of the Wikipedia articles have not been maintained for quite some time. They are not aware of Mrs Rauhala for instance.

To improve on the Wikipedia articles, all it takes is a mechanism to highlight when a new award is given in a year. The data can be found in Wikidata and, as you can see, in Reasonator we have the timelines showing the winners in order.

As we have the data, we can query for this year's awards. With hidden queries, we can exclude those articles that are known to have a winner. It is not hard, it is motivating to share in the sum of all available knowledge.
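A minimal sketch of such a check, assuming the winners have already been fetched from Wikidata and the names mentioned in each Wikipedia article have been extracted. Only the 2014 winner comes from this post; the wiki codes, their contents and the function name are placeholders made up for illustration.

```python
# What Wikidata knows: the winner per year (2014 winner taken from this post).
wikidata_winners = {2014: "Pauliina Rauhala"}

# Hypothetical placeholder data: names mentioned in the award article per wiki.
article_mentions = {
    "wiki_a": {"Pauliina Rauhala"},
    "wiki_b": set(),
}

def wikis_missing_winner(year, winners, mentions):
    """Return the wikis whose article does not mention the winner for `year`."""
    winner = winners[year]
    # Articles that already know the winner are excluded from the result.
    return sorted(wiki for wiki, names in mentions.items() if winner not in names)

print(wikis_missing_winner(2014, wikidata_winners, article_mentions))
```

The result is the list of articles that need attention; the ones that already mention the winner stay hidden.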

Tuesday, November 11, 2014

#Wikidata - Annette Polly Williams

As the longest-serving woman in Wisconsin's Legislature, Mrs Williams deserves to be recognised. In her honour, 1752 members of that legislature will be recognised as such and they will be known as politicians.

Mrs Williams was an advocate of education for all kids. Describing this is one of the things a Wikipedia article is good for. It is not obvious how to indicate this in Wikidata.

It is a bit strange to find that many American politicians do not have a picture to illustrate their articles. Given the absurd amounts of money involved, providing a few pictures of politicians would be a cheap gesture that would be appreciated.

Thursday, November 06, 2014

#Wikipedia - Now in #Maithili

It is a happy occasion when a new Wikipedia is created. Today we may welcome the Maithili Wikipedia. The website has been created and all the content that is currently still in the Incubator needs to be migrated.

I wish the Maithili community well; I hope they will share with us in the sum of all available knowledge.

Wednesday, November 05, 2014

#Commons - Adolphe Berty

Mr Berty was a French author, antiquarian and archeologist. There is no Wikimedia article about him, there is no Wikidata item for him. There is material on Commons he created. There are links to several external sources making Mr Berty notable enough for them as well.

Amir ran his bot so all the people with Wikipedia articles are now linked to Wikidata as well. People like Mr Berty just need to be created. Really funny is the realisation that many Wikimedians have or should have their own Creator template. As a consequence they are notable for Wikidata. If not now, certainly when Commons is wikidatified.

It is fun thinking about the implications of the wikidatification.

#Wikidata is the #OpenData winner

Wikidata won first place in the category "Publisher" at the OpenData awards.

It is so well deserved to find both Magnus and Lydia share the limelight. I could not be more pleased.

Monday, November 03, 2014

#Wikipedia - Hey, #College Boy !!

You know the answer to this question: "What can Wikipedia do for you?" A more interesting question is: "What can you do for Wikipedia?"

So you are at this wonderful college where people who changed the world have been educated, right? Well, never mind what your college says, it is Wikipedia and Wikidata where you find public data about graduates that are considered notable enough.

You may find them in articles, in categories or in none of the above in combination with your college. Wikidata is another place where you find them as well.

Now the questions you might want answers to are:
  • how many graduates are notable enough for one or more Wikipedia articles
  • how many graduates do we know
  • what are those graduates also known for
  • what are the most linked to statements for your graduates
College boy, and girl, you are getting an education. This challenge seems trivial: how will you show what the Wikiverse knows about the people associated with your college?

Sunday, November 02, 2014

#Wikidata - #Wikipedia categories

Wikidata knows about Wikipedia categories. Currently there are items for 2,406,128 categories. Many of those items refer to categories in multiple Wikipedias. One random example is item Q8884100; it refers to categories in 10 Wikipedias. All of them categorize people who studied at the University of Notre Dame. Many of them know about people only known in that Wikipedia. In addition to this there may be articles that are not categorised or references to articles that are not in one of the 10 categories known at this time.
When multiple categories are "harvested" in Wikidata, Wikidata knows about more items than any of the individual categories. This enables the use of the data in new ways.
  • suggest categories based on the presence of statements in the Wikidata category
  • suggest statements when an article is included in a category
  • include red links in a category when a Wikipedia does not have articles.
Before such functionality becomes available, certain tipping points will need to be reached. For instance, enough categories need to be harvested in this way and these categories have to be identifiable for the information they include.

The category "University of Notre Dame alumni" indicates that it is a list of humans AND, for them the statement "alma mater" "University of Notre Dame" has to be made. For over 1450 categories that are about humans similar information exists. Every day more categories are added.
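The idea of harvesting can be sketched with plain sets. This is only an illustration under assumptions: the wiki codes are used as labels, the item ids are placeholders and `missing_in` is a name I made up; it is not how any existing tool works.

```python
# Hypothetical category memberships per Wikipedia; the item ids are placeholders.
category_members = {
    "enwiki": {"Q_A", "Q_B", "Q_C"},
    "frwiki": {"Q_B", "Q_D"},
    "nlwiki": {"Q_C", "Q_E"},
}

# Harvesting: the union knows about more people than any single category does.
harvested = set().union(*category_members.values())

def missing_in(wiki):
    """Harvested items that the category on `wiki` does not include."""
    return sorted(harvested - category_members[wiki])

print(len(harvested), max(len(m) for m in category_members.values()))
print(missing_in("nlwiki"))
```

The items reported as missing are either red link candidates or existing articles that still need the category; telling those apart takes a look at the sitelinks of each item.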

I really wonder what it takes to reach the tipping points that will bring more application of the Wikidata data to, for instance, Wikipedia.

Wednesday, October 29, 2014

#Wikimedia - Men at work; preparing a #presentation IV - #WCN2014

The Dutch community has one question to answer: what to do with available information in Dutch? How will we make it available? Currently there are 3,054,955 items [1] with labels and there are 1,890,905 items [1] that link to the Dutch Wikipedia. It follows that the labelled items without an article in Dutch number about 62% of the items that do have one.

This is a substantial amount of information that can be presented in Dutch. Similar numbers can be presented for any language; for English it is 39% and for German 121%.
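For those who want to check the arithmetic: the 62% appears to be the number of labelled items without an article relative to the number of items with an article, which is also the only reading under which a figure above 100% makes sense. That reading is my assumption; the two counts are the ones quoted above.

```python
labelled = 3_054_955      # items with labels (figure from this post)
with_article = 1_890_905  # items linking to the Dutch Wikipedia (figure from this post)

without_article = labelled - with_article

# Relative to the items that do have an article: this reproduces the 62%.
ratio_to_articles = without_article / with_article

# Relative to all labelled items: an alternative reading, shown for comparison.
share_of_labelled = without_article / labelled

print(f"without article: {without_article:,}")
print(f"relative to items with an article: {ratio_to_articles:.0%}")
print(f"share of all labelled items: {share_of_labelled:.0%}")
```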

Arguably, these items fulfill notability requirements somewhere. Arguably the Swedes have demonstrated that having more information available revitalised their community. Arguably, allowing for search results from Wikidata is an easy first step towards opening up all our available knowledge.

[1] these links take a few minutes to load; they provide real time information

Tuesday, October 28, 2014

#Wikidata - #algorithm for updating labels

Amir is the #pywikibot guru; he runs dexbot and it is the only bot with more than 20,000,000 edits. Amir regularly tinkers with the routines that he uses. Sometimes he gets better performance, sometimes he gets a better result.

The algorithm for adding labels has changed several times and the result of the latest change can be seen in the statistics below. You may notice several spikes; the last one is captured in the last dump and it resulted in many more labels for items where one label already existed.
It is people like Amir who make a real difference. One bot request of his for Commons will help the Commoners see that Wikidata knows about the people mentioned in the Creator templates. Jobs like this are essential if the wikidatification of media files is to succeed.

#Wikimedia - Men at work; preparing a #presentation III - #WCN2014

The bane of every live demonstration is when the software just does not work. My intention is to show #Wikidata in action and to demonstrate the Reasonator and AutoList2. If the experience of the last few weeks is anything to go by, I have a 50% chance of a reasonable result on the day.

There are many factors that can play up. Timeouts at Wikidata are no exception at the moment and when Wikidata does not play ball, everything downstream from it suffers as a consequence. It means that I may not have a recent list of recent deaths because ToolScript does not function.

AutoList2 relies on WIDaR, which in turn relies on being able to contact Wikidata reliably. Without this, AutoList2 does not run.

The subject of my presentation is firmly solution oriented. I can always fall back on screenshots. That feels like cheating.