Tools

  • hunalign - sentence aligner

    hunalign aligns bilingual text at the sentence level. Its input is tokenized and sentence-segmented text in two languages. In the simplest case, its output is a sequence of bilingual sentence pairs (bisentences). In the presence o...

  • huntag

    Huntag is a sequential tagger for NLP that uses Maximum Entropy learning and Hidden Markov Models. It can perform any kind of supervised sequential sentence tagging task. It has been used for NP chunking, Named Entity Recognition, and cla...

  • huntoken

    huntoken is a rule-based tokenizer and sentence boundary detector for Hungarian (and English) texts.

  • Hungarian Webcorpus

    With over 1.48 billion words unfiltered (589 million words fully filtered), this is by far the largest Hungarian-language corpus, and unlike the Hungarian National Corpus (125 million words), it is available in its entirety under a permissive Open Content license...

  • Hunglish Corpus

    The Hunglish Corpus is a free sentence-aligned Hungarian-English parallel corpus of about 54.2 million words in 2.07 million sentences. The corpus can be downloaded from our FTP server. If you have any questions don't...

  • hunmorph - morphological analyzer

    Hunmorph is an open source tool and programming library for spell-checking, stemming, and morphological analysis of agglutinative, Germanic, and other languages. Our research group has been working on a Hungarian morphological an...

  • hunpars - syntactic parser for Hungarian

    Hunpars is a syntactic parser for Hungarian. It takes a text file containing sentences as input and outputs the syntactic tree of each sentence, both in a simple bracketed notation and as files in the Graphviz dot language. Software...

  • hunpos - HMM part-of-speech tagger

    Hunpos is an open source reimplementation of TnT, the well-known part-of-speech tagger by Thorsten Brants. The project has moved to Google Code: http://code.google.com/p/hunpos/ Features: free and open source, even for commercial use. For langu...

  • morphdb.hu - Hungarian lexical database and morphological grammar

    morphdb.hu is an open source morphological database of Hungarian, consisting of a lexicon and a morphological grammar, both based on well-founded theoretical decisions. morphdb.hu is described in the formal representation format of hunlex, an of...