Normand,
tagging documents would indeed not be feasible under certain conditions:
1. How would one represent shared documents, those which are used (coded) more than once in a single or several projects? This again multiplies the overlap problem you mentioned.
2. How would you tag documents already tagged in a different syntax (you mentioned HTML yourself)?
3. How would you tag archived read-only docs, or those for which embedded tagging cannot be done at all (images, audio, video)?
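The overlap problem behind point 1 can be illustrated with a small sketch. It is only an illustration under stated assumptions: the element names `A`, `B`, `start` and `end` and the `code` attribute are hypothetical and are not taken from any tool's actual export format. It shows that two overlapping inline tags are rejected by an XML parser, while empty "milestone" markers carrying the same information are accepted and can be turned back into character ranges.

```python
import xml.etree.ElementTree as ET

# Two codes whose segments overlap: A covers "two three", B covers "three four".
# Written as ordinary inline tags, this is not well-formed XML.
overlapping = "<doc>one <A>two <B>three</A> four</B> five</doc>"

# The same codings expressed with empty milestone elements
# (hypothetical element and attribute names).
milestones = (
    '<doc>one <start code="A"/>two <start code="B"/>three'
    '<end code="A"/> four<end code="B"/> five</doc>'
)

def is_well_formed(xml_text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def milestone_spans(xml_text):
    """Recover (code, start, end) character offsets from milestone markers."""
    root = ET.fromstring(xml_text)
    pos = len(root.text or "")      # text before the first marker
    open_at, spans = {}, []
    for el in root:
        code = el.get("code")
        if el.tag == "start":
            open_at[code] = pos
        elif el.tag == "end":
            spans.append((code, open_at.pop(code), pos))
        pos += len(el.tail or "")   # text following this marker
    return spans

print(is_well_formed(overlapping))
print(is_well_formed(milestones))
print(milestone_spans(milestones))
```

A generic XML tool will parse the milestone version but has no way of knowing that a `start`/`end` pair belongs together; that knowledge lives in code like `milestone_spans`, which is why an agreed convention matters.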
XML tagged docs would indeed ask for an agreed standard and that is indeed a "longitudinal" enterprise.
Although ATLAS.ti does not attempt to tag the documents to be analyzed, we have had an XML-based export/import scheme for "project" data running for almost 6 years now.
By project data we mean document metadata, codes and descriptions, memos, network structures, authors, definitions of links and relations, queries, content types and "codings" (positional references into the analyzed data). By the way, the XSD schema can be viewed at:
http://www.atlasti.com/downloads/atlasti_hu.xsd (used since 2004).
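To make the idea of "codings" as positional references concrete, here is a minimal sketch. It is not the ATLAS.ti schema: the element names `project`, `document` and `coding`, the attributes, and the sample text are all assumptions made up for illustration. The point is that the analyzed document stays untouched, so shared, read-only and overlapping cases pose no tagging problem.

```python
import xml.etree.ElementTree as ET

# The analyzed document is never modified; it could be read-only or shared.
document_text = "The interview started late. The mood was tense but friendly."

# Codings as (code name, start offset, end offset); overlaps are allowed.
codings = [
    ("delay", 4, 27),
    ("atmosphere", 28, 60),
    ("opening", 0, 36),   # overlaps both codings above
]

def export_project(doc_id, codings):
    """Build a project XML string holding only references into the document."""
    project = ET.Element("project")
    ET.SubElement(project, "document", id=doc_id)
    for code, start, end in codings:
        ET.SubElement(project, "coding", code=code, doc=doc_id,
                      start=str(start), end=str(end))
    return ET.tostring(project, encoding="unicode")

def quotations(project_xml, text):
    """Resolve each coding back to the text segment it points at."""
    root = ET.fromstring(project_xml)
    return {c.get("code"): text[int(c.get("start")):int(c.get("end"))]
            for c in root.iter("coding")}

project_xml = export_project("P1", codings)
for code, segment in quotations(project_xml, document_text).items():
    print(code, "->", repr(segment))
```

Character offsets suit text; for images, audio or video the same scheme would need regions or time ranges instead, which this sketch does not cover.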
The powerful thing about the XML standard is that you do not necessarily have to wait forever for a domain-specific "QDAML" standard; you can start right away to open up your application in a non-proprietary fashion. Using XML to describe your projects greatly eases the burden of migrating such data so that other applications can exploit it. Of course, the number of necessary translators (stylesheets) interpreting the XML export of a specific tool would be large compared with having a standard QDAML, but at least there is a standard language for defining such translators.
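As a rough sketch of what such a translator does, the following maps one hypothetical export format onto another. Two caveats: real translators of this kind would typically be written as XSLT stylesheets, whereas this example uses plain Python for brevity, and both the "tool A" and "tool B" formats are invented for illustration and do not correspond to any existing product.

```python
import xml.etree.ElementTree as ET

# Hypothetical export of "tool A": codes and codings in one flat list.
TOOL_A_EXPORT = """
<project>
  <code id="c1" name="delay"/>
  <code id="c2" name="atmosphere"/>
  <coding code="c1" doc="P1" start="4" end="27"/>
  <coding code="c2" doc="P1" start="28" end="60"/>
</project>
"""

def translate_a_to_b(xml_text):
    """Regroup tool A's flat codings into tool B's per-category segments."""
    source = ET.fromstring(xml_text)
    target = ET.Element("codebook")
    by_id = {}
    # Each <code> in format A becomes a <category> in format B.
    for code in source.iter("code"):
        by_id[code.get("id")] = ET.SubElement(
            target, "category", {"label": code.get("name")})
    # Each <coding> becomes a <segment> nested under its category.
    for coding in source.iter("coding"):
        ET.SubElement(by_id[coding.get("code")], "segment", {
            "source": coding.get("doc"),
            "from": coding.get("start"),
            "to": coding.get("end"),
        })
    return ET.tostring(target, encoding="unicode")

print(translate_a_to_b(TOOL_A_EXPORT))
```

Every pair of tools needs its own mapping like this one, which is exactly the cost a shared QDAML would remove.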
You are right that a standard should be agreed upon for tagging textual data where it suits, and it would be beneficial to have a standard for representing models, the codes, the codings, the networks, links, etc.
One that would work in QDA Miner, ATLAS.ti and any other app that opens up its proprietary shields.
- Thomas
At 20:12 14.06.2006, you wrote:
>At 6/13/2006 08:00 AM, Thomas Muhr wrote:
>>One important aspect regarding long-term longitudinal studies is "standards". There are several threats associated with using tools over a longer period of time. One is that the original tool is simply no longer available, for whatever reason (which will never happen to us, of course!-). Or you might wish to analyze your data with different tools at some stage of research. In both circumstances an approved standard for the project data might become important. ATLAS.ti supports such standards.
>
>Thomas,
>
>Well! QDA Miner v2.0 also, and until Atlas-ti 6.0 is released next year, I would say that QDA Miner is the only major QDA software that can be used to export tagged documents to XML. But the point I would like to make is something completely different. While I agree that supporting XML is something useful, I would like to express some reservations about the true value of XML. I would not say that XML is a standard; it is more a structured language in which standards could be developed and are being developed. Actually, there are currently hundreds of standards developed in XML for various purposes, such as the storage of taxonomies, thesauri, and linguistic coding, as well as for numerous industries. The first XML standard we chose to support in QDA Miner was Triple-S, an XML standard to exchange survey data. When we started to look at standards for hierarchical taxonomies, we found 3 XML standards, none of which was widely used.
>
>This leads me to identify a few problems with XML. First, importing data in XML can be quite problematic since, unless we follow more specific standards when choosing the names of elements, mapping the structure of an existing XML file to an application will be very difficult. One can still use some XML editors, but they are not the kind of tools that can be learned quickly. While for HTML we know pretty much which applications have to be supported (i.e. the browsers), there are no such tools for XML (except maybe the syntax checking tools that can only be used to make sure you followed the XML language conventions).
>
>Another major problem with XML when applied to QDA is that it currently does not provide conventions for overlapping markups, something that is quite common in QDA. XML is a highly structured version of SGML where elements must be structured hierarchically (they say "well-formed") so that tag contents should never overlap. But sometimes you need them to overlap, and there have been many proposals for solving this issue (the Text Encoding Initiative and the OSIS group proposed the use of two different kinds of milestones, and some suggested using other existing SGML or XML features, or adding new ones). I don't know which strategy Atlas-ti 6 will use for exporting tags in documents, but we decided to use a method somewhat similar to the one proposed by the OSIS group. Although we follow the XML rules by doing this, most XML editing tools available today won't be able to correctly interpret the significance of those special "milestones". If you choose another solution, then people may never be able to exchange coded documents between QDA Miner, Atlas-ti and maybe even between any other software supporting XML. If you choose a solution similar to the one we have chosen, then the only software that will be able to use those markups produced by Atlas-ti will be our software (and vice versa).
>
>This brings me to a last problem with XML. Since overlapping codes are not allowed or not easily implemented in XML, any text formatting has to be dropped from the document. You cannot use <b> </b> to put things in bold or <i> </i> to make text italic. You may be able to do this, but you would have to make sure those codes never overlap. HTML can do this, simply because it breaks the "well-formedness" rule of XML. (This is also the reason why XHTML may also be problematic, since they are proposing it to eliminate overlapping codes in HTML 4).
>
>I would say that in order for XML to be really useful for qualitative researchers, we would have to sit down, you and me, and all those interested in this idea of allowing easy exchange of data between QDA tools, transcription tools, etc., and develop our own standards for the QDA community. We may also decide which other standards should ideally be supported. Should we follow the Text Encoding Initiative group (and wait for them to solve the markup overlap problem) or the OSIS group (the Bible Technologies Group), which has also proposed some solutions for this? Should we develop our own standards and our own solution and make sure those standards fulfill our needs?
>
>Until we have our own standards, I would say that the easiest way to exchange tagged documents would be to use plain text (or Unicode text), HTML, and maybe even RTF. XML has the potential to contribute to the development of standards for the QDA community, but we are just not there yet.
>
>Best regards,
>
>Normand Péladeau
>Provalis Research
_______________________________________________________________________
„Computers, like every technology, are a vehicle for the transformation
of tradition“ (Winograd & Flores, 1987)
ATLAS.ti Scientific Software Development GmbH - Berlin - www.atlasti.com