writeslike.us: Wins and Fails

Wins:
➢ Getting information such as institution names/URLs from Wikipedia, and widespread use of available web services in general
➢ Extracting names from OAI-DC was easier than expected – although there are still issues with identifying name pair order.
➢ Evidence-based learning methods can be applied successfully to the retrieved data to enhance it – getting into FixRep territory. The project has been very useful for establishing further use cases for ‘cleaning up’ metadata.
➢ Some interesting work in name / identity disambiguation through statistical clustering analysis. We’re looking at linking extracted info together with formal information such as that made available by the NAMES project.
➢ Storyboards defining the workflow of the system form an effective part of the agile development process, and were very useful for us.
➢ Using an SQL db as the repository was effective once problems with slow queries were addressed through: normalizing data, reviewing db schema design, adding indexes as necessary.
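
To illustrate the last point, here is a minimal sketch of how adding an index changes the way a lookup is executed. It uses Python's built-in sqlite3 module rather than the project's actual database, and the table names, column names and sample rows are hypothetical, chosen only for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical normalized schema: institutions are stored once and
# referenced by id instead of being repeated on every author row.
conn.executescript("""
    CREATE TABLE institution (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE author (
        id             INTEGER PRIMARY KEY,
        surname        TEXT NOT NULL,
        forename       TEXT,
        institution_id INTEGER REFERENCES institution(id)
    );
""")
conn.execute("INSERT INTO institution (name) VALUES (?)", ("Example University",))
conn.executemany(
    "INSERT INTO author (surname, forename, institution_id) VALUES (?, ?, 1)",
    [("Surname%d" % i, "A") for i in range(1000)],
)

lookup = "SELECT id FROM author WHERE surname = ?"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(conn, lookup, ("Surname42",))

# Add an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_author_surname ON author (surname)")
plan_after = query_plan(conn, lookup, ("Surname42",))

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two plans shows the lookup switching from a full table scan to a search that uses the new index. Checking the plan in this way is one means of deciding where an index is actually necessary.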

Fails:
➢ Natural Language Toolkit (NLTK) – we didn’t use it for its original purpose. Instead, we went back to TreeTagger, although it was not specifically trained for the sort of technical documents we were analysing.
➢ The text analysis expertise required for this project wasn’t already present in the team. It would’ve been a good idea to arrange training for the team to make sure we were all on the same page!
➢ Not all related documents, URIs, etc. were contained/linked in the project wiki. Next time, ensure that they are.
➢ Cultural mismatch between research approach to defining requirements/expectations and development requirements/expectations. e.g. who writes the formal requirements document?
➢ Earlier storyboard scenario development would have been helpful, so a good lesson for next time.
➢ Swine flu had quite a severe effect on this project – our Portuguese collaborators were unavailable for quite some time due to a) the danger of traveling to the UK and contracting the virus, and (after subsequently contracting the illness in Portugal) b) the effects of the illness!