I recently had the opportunity to learn this AI solution. I was watching some video materials about IBM Watson in December, and I thought that it was tremendous technology. Just watch this marketing video:
It looks so easy and seems to offer lots of possibilities…
Then at work I had a chance to play with IBM Watson Explorer. What I have to admit so far: the learning materials are very poor, with too few examples and lots of marketing content. However, I found some excellent materials, which I would like to share:
1. Excellent materials for students describing the whole suite:
2. Very good blogs:
A short description of this product from my perspective so far:
It consists of 3 separate modules:
- Watson Explorer Foundations – it consists of:
- IBM Watson Explorer – used for acquiring data from many sources, such as text documents, Excel sheets, presentations, URLs, REST APIs and databases. The engine then converts this data into VXML (a special kind of XML), enriches it with annotations or sentiment analysis, and can extend possible queries using synonym or thesaurus dictionaries. Afterwards the data is sent to the indexer, where the chosen columns are indexed, similar to a database, and put into RAM. Everything lands in the search engine, where you can define so-called „facets”. Facets represent data that should be grouped, exactly like „group by” or more sophisticated analytic functions in a database. It is built as a CGI application using the Vivisimo engine. Everything runs on a Java server (Jetty), which serves jobs for each document collection, with Apache as the frontend. This module requires quite a lot of RAM (at least 64 GB) and lots of free disk space.
- IBM Watson AppBuilder – uses data from the previous module and enables the creation of so-called „360 overview applications”. Here is a better description of it: https://books.google.pl/books/about/Building_360_Degree_Information_Applicat.html?id=fq6jBAAAQBAJ&redir_esc=y This module is written entirely in Java and Ruby and runs on the WebSphere Liberty web server. All configuration data is kept in Zookeeper, including the layout of the application and the configuration of entities and endpoints together with their source code. Unfortunately I had quite big trouble with this part when I discovered that the password for the backend attached to the app was wrong and the whole application stopped working. As this password was stored as an encrypted string embedded in the XML configuration in Zookeeper, it was super hard to change. The only way was to recreate the app from scratch and import all the data from a backup; only then could I change the password… Another difficulty was the fact that it was designed to create only one app at a time. Fortunately we found a way to work around it.
- IBM Watson Analytics – this module seems to me the most promising, as it covers the most interesting part: text analysis with analytical functions that lets you get real insights. Here is a better description of it: https://www.redbooks.ibm.com/redbooks/pdfs/sg247877.pdf This part is written entirely in Java, but it requires a standalone program called „Watson Analytics Content Studio” to create the projects used in this module.
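To make the „facets” idea from the first module a bit more concrete, here is a minimal sketch in plain Python. It is not the Watson Explorer API; the documents and the field names (`source`, `language`) are invented for illustration. It only shows the „group by” behaviour: counting documents per value of a field, and narrowing the result set when one facet value is picked.

```python
from collections import Counter

# Hypothetical indexed documents; the fields are made up for this example
# and do not come from any real Watson Explorer collection.
documents = [
    {"title": "Q1 report", "source": "excel", "language": "en"},
    {"title": "Product page", "source": "url", "language": "en"},
    {"title": "Offer", "source": "excel", "language": "pl"},
    {"title": "Manual", "source": "pdf", "language": "en"},
]


def facet_counts(docs, field):
    """Count how many documents fall under each value of `field`."""
    return Counter(doc[field] for doc in docs if field in doc)


def refine(docs, field, value):
    """Keep only the documents whose `field` equals `value`."""
    return [doc for doc in docs if doc.get(field) == value]


# Facet over the whole collection, like GROUP BY source with COUNT(*).
source_facet = facet_counts(documents, "source")

# Picking a facet value narrows the results; facets are then recomputed.
english_only = refine(documents, "language", "en")
english_source_facet = facet_counts(english_only, "source")
```

In a real search engine these counts come from the index rather than from looping over documents, but the result the user sees is the same kind of grouped summary.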
So far I have created one collection, which crawled a URL and retrieved data from a CSV file. It was quite easy; however, it took me a while to understand which converter I should choose and which columns should be indexed. I have also created a simple application in AppBuilder. Creating entities was super easy. Creating endpoints, hmmm… as I am not so familiar with Ruby, it took me a while to figure out how to formulate a proper „query” for my data. Adding widgets to my layout, with the default ones, was quite easy. I haven’t created a custom widget so far… Adding natural language processing was moderately hard.
Right now my simple application doesn’t show a lot, but for a first attempt I think it isn’t so bad.
What I would like to have in this suite:
- integration with repository
- easy editor for data kept in Zookeeper
- proper help for all of the crawling/converting options
- in AppBuilder – an indicator of non-working endpoints
- super easy synchronisation between dev and prod environments