aConCorde - v0.4.1 - README

Andrew Roberts (16th August 2004)
andyr [at] comp [dot] ac [dot] uk

Overview
========

aConCorde is a multi-lingual concordance tool. Originally developed for
native Arabic concordance, it possesses basic concordance functionality,
as well as English and Arabic interfaces. It is written in Java, so it
will run on any platform that has the Java Runtime Environment installed
(which is available for most major operating systems).

It currently has the following features:

 * Full Arabic support (no need to transliterate to ASCII before
   concordance).
 * English and Arabic native interfaces.
 * Multiplatform - runs on Linux and Windows (including non-Arabic
   Windows) - requires the Java Runtime Environment. (See "What is
   Java?" and "Upgrading or Installing Java" below for help.)
 * Supports Unicode (UTF-16), UTF-8, CodePage Windows-1256 and ASCII
   encodings.
 * Word frequency analysis.
 * Concordance can be sorted on left or right contexts.
 * Save concordance output to file (as plain text or HTML aligned
   tables).

Finally, aConCorde is free :)

What's new in v0.4.1?
=====================

 * This version tweaks the 0.4 version (see details in next section).
 * Loading of corpora is significantly quicker.
 * Concordance output is also faster.
 * A new configuration file (aConCorde.properties) allows you to set
   the default context size permanently.
 * Fixes UNIX build.sh bug.

What's new in v0.4?
===================

A number of improvements have been implemented to make aConCorde more
useful for language analysis. To summarise:

 * Support for multiple corpora being opened at the same time.
 * Punctuation is now included in concordance output and no longer
   disregarded.
 * Ability to save concordance output to file (as either plain text or
   HTML aligned tables in a number of encodings).
 * Extra text encoding standards: CodePage Windows-1256, IBM Arabic
   (Cp420), ISO Latin/Arabic (ISO8859_6) and MacArabic.
 * A number of little bug fixes.

Installation
============

 1. Download the aConCorde utility from the project web page
    (http://www.comp.leeds.ac.uk/andyr/software/aConCorde). It will be
    bundled into a single zip archive, e.g. aConCorde-0.4.1.zip.
 2. Unzip the file to the location on your system where you want to
    install the tool. [For Windows users, WinZip
    (http://www.winzip.com) is a popular utility for opening zip files
    and is available as shareware.]
 3. The files necessary for aConCorde are now on your system ready for
    use.

Running aConCorde
=================

NOTE: Before you can run aConCorde on your system, you must ensure that
the Java Runtime Environment (version 1.4 or greater) is also installed
on your computer. This can be easily downloaded from
http://java.sun.com and installed. See the "What is Java?" section
below for information about what Java is, and how to install it.

To run aConCorde from the command line:

 1. Go to the directory where aConCorde was installed (this will
    contain all the Java .class files).
 2. Type: 'java aConCorde' (without the quotes)
 3. If you want to start with the Arabic interface, type:
    'java -Duser.language=ar aConCorde' (without the quotes)

Linux users: Alternatively, there are two simple shell scripts that can
run aConCorde in a single command. At the command prompt, './aConCorde'
will run the program, and './aConCorde_ar' will run aConCorde with the
Arabic native interface. To avoid the command line, you can create menu
items and/or desktop shortcuts to these scripts, which then allow you
to run the program with a mouse click.

Windows users: To run aConCorde without using the command line,
navigate to the aConCorde folder using My Computer or Explorer, and
then double-click the file called 'aConCorde.bat'. This is a simple
script that runs the above command line. Run 'aConCorde_ar.bat' to
start with the Arabic interface. You can make shortcuts to these .bat
files from your desktop or Start menu for easier access.
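
The Java 1.4 requirement mentioned in the note above can also be
checked from within Java. The following is only a minimal sketch and is
not part of aConCorde: the class name JavaVersionCheck and the method
isAtLeast14 are made up for illustration. It reads the standard
'java.version' system property and compares the first two numbers
against 1.4.

```java
// Hypothetical helper, not shipped with aConCorde.
class JavaVersionCheck {

    /**
     * Returns true if a version string such as "1.4.2_05" or "17.0.2"
     * denotes Java 1.4 or later. Only the first two numeric fields
     * are inspected; a missing field counts as 0.
     */
    static boolean isAtLeast14(String version) {
        // Split on anything that is not a digit: "1.4.2_05" -> 1, 4, 2, 05
        String[] parts = version.split("[^0-9]+");
        int major = (parts.length > 0 && parts[0].length() > 0)
                ? Integer.parseInt(parts[0]) : 0;
        int minor = (parts.length > 1 && parts[1].length() > 0)
                ? Integer.parseInt(parts[1]) : 0;
        return major > 1 || (major == 1 && minor >= 4);
    }

    public static void main(String[] args) {
        // "java.version" is a standard system property of every JVM.
        String version = System.getProperty("java.version");
        if (isAtLeast14(version)) {
            System.out.println("Java " + version + " should be able to run aConCorde.");
        } else {
            System.out.println("Java " + version + " is older than 1.4; please upgrade.");
        }
    }
}
```

Typing 'java -version' at the command line gives the same information
without any code; the sketch is mainly useful if you want a launcher
script or another program to make the check automatically.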

Limitations / Bugs
==================

As you can see from the version number, this is a very early release. I
thought that it would be better to release it anyway, even though it's
far from a complete product. The release approach will be little and
often. However, as a consequence of this early release there are also a
number of shortcomings:

 * Swapping languages will close the current corpus session. You will
   therefore need to ensure you are in the correct language before
   opening your corpus file.
 * The search box will only accept a single word, and not a phrase.
 * aConCorde does not yet know about corpus markup! Therefore, any
   markup, such as XML tags, will be treated like normal words.
 * No progress monitors have been implemented. This means that loading
   a large file, for example, or obtaining the concordance for a word
   with many instances may take a long time to complete. Whilst this
   happens, the program will appear to freeze. However, it will come
   back to life when the task has completed. In future versions, a
   progress indicator will be displayed to inform the user about the
   progress of the task (which also confirms that the program is doing
   something and not frozen!). In the meantime, just be patient!
 * There are still a few issues with the Arabic native interface. Some
   of the standard dialog boxes that come with Java, such as the file
   selection window for loading and saving files, are still in English.
   I have not had a chance to work out how to alter these pre-built
   dialogs. Still, they should be fairly self-explanatory.

What is Java?
=============

Java can be viewed as two things. Firstly, it is a programming
language, and a fairly good one at that! It is designed and developed
by Sun Microsystems (http://www.sun.com) by some of the best computer
scientists and engineers. It has many advantages for software
developers that have led to Java becoming one of the most popular
development languages currently in use.

Secondly, it is a 'virtual machine'.
Typically, when a developer writes some software, they have to compile
their code. The compiler converts the program code into machine code,
which is specific to the platform that you are using. For example,
programs compiled for Windows do not work on Mac OS. If you want
programs to run on other platforms, then the developer often has to
spend a great amount of effort changing the source code to conform to
the specifications of a given platform, and then compile again!

However, one of Java's main strengths is its ability to "write once,
run anywhere". What this means is that when some software is written in
Java, it can run on any platform without modification. To run, however,
the program needs the Java Virtual Machine.

The Java Virtual Machine is a special piece of software written by the
developers of Java. They have gone to great lengths to make a version
of the Virtual Machine available for most major platforms. For example,
you can get it for Windows, Linux, Mac OS, Solaris, IRIX, etc. Once you
install a virtual machine for your particular platform, you are ready
to run Java applications.

The Virtual Machine removes the need to compile an application to
machine code for a specific platform. Instead, the Virtual Machine runs
on top of a platform, and acts as an interpreter for the Java code.
This is why, before you can run Java applications, you need the Java
Virtual Machine.

Upgrading or Installing Java
============================

All official Java resources are available from Sun's Java website
(http://java.sun.com). For aConCorde, you need to have Java 1.4 or
greater. If you already have Java installed, you can check its version
by typing 'java -version' from a command line. If it is not version 1.4
or greater, then you will need to install an up-to-date version of
Java.

When you go to the Java website, it can be a little overwhelming if you
are not familiar with Java.
It is possible to find everything you need there; however, Sun
Microsystems has created an alternative site for those less interested
in the technical aspects of Java.

Go to http://java.com/en/index.jsp and you will see on the top right a
panel containing "Free Download. Java software for the desktop." and
below it a button labelled "Get it now". If you click this button, the
website will automatically detect your operating system, and then
direct you to a page specific to that platform. This page will contain
a link for the latest version of the Java Virtual Machine (also known
as the Java Runtime Environment). Download this file, and then follow
the installation instructions that are also present on that download
page.

Building aConCorde from Source
==============================

aConCorde is released under the GNU General Public Licence (GPL)*.
Therefore the source code is open to anyone who wants it :) Feel free
to delve in and see how aConCorde was built.

Obviously, a pre-compiled version of this release is available from the
project website. However, if you enjoy compiling code, then here are
some instructions. This is only for more confident Java developers. NB:
you will need Java SDK 1.4 or greater to compile this project.

 1. aConCorde currently uses the Regex package from Stevesoft
    (http://www.javaregex.com/home.html). This is also available under
    the GPL. You must make this package available on your PC and then
    ensure that you put it in your classpath so the compiler can find
    it.
 2. Grab the source code zip file:
    (http://www.comp.leeds.ac.uk/andyr/software/aConCorde/releases/0.4/aConCorde-0.4_src.zip)
 3. Uncompress it to a suitable location on your PC.
 4. Within the zip file I included a couple of simple build scripts.
    For Unix-based OSes there is build.sh, which passes each .java file
    to the Java compiler in the required order. For Windows, build.bat
    is provided, which does exactly the same thing when run.
 5. Done! 'java aConCorde' will execute aConCorde.
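
Step 1 above is the one most likely to go wrong, so it can help to
confirm that the Regex package really is visible on your classpath
before running the build scripts. The following is a rough sketch only
and is not part of the aConCorde sources: the class ClasspathCheck is
made up for illustration, and the default class name
'com.stevesoft.pat.Regex' is my assumption about how the Stevesoft
package is laid out, so pass a different name as an argument if your
copy differs.

```java
// Hypothetical helper, not part of the aConCorde source zip.
class ClasspathCheck {

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class but do not run its static initialisers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its own dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: com.stevesoft.pat.Regex is the main class of the
        // Stevesoft package; override it on the command line if needed.
        String className = args.length > 0 ? args[0] : "com.stevesoft.pat.Regex";
        if (isOnClasspath(className)) {
            System.out.println(className + " found; the build should be able to see it.");
        } else {
            System.out.println(className + " not found; check your classpath setting.");
        }
    }
}
```

Run it with the same classpath setting that you intend to use for
build.sh or build.bat. If the class is reported as not found, the
compiler will not be able to find it either.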

* My code is forever protected by the licence, which means that if you
  wish to include my code within your own project, then you must also
  release that project under the GPL.

Contact
=======

If you wish to contact the developer about aConCorde to suggest future
features, report bugs, or discuss anything else, please email me at:

    andyr [at] comp [dot] ac [dot] uk

* Anti-spam format. Please remove all spaces, and replace '[at]' with
  the '@' symbol (no quotes), etc.