Chaudhary, Vishrav, Tang, Yuqing, Guzmán, Francisco, Schwenk, Holger, Koehn, Philipp. “Low-resource corpus filtering using multilingual sentence embeddings.” Proceedings of the Fourth Conference on Machine Translation (WMT). Bojar, Ondřej et al. (eds.). Florence: Association for Computational Linguistics, 2019.
Dimitrova, Ludmila, Koseska-Toszewa, Violetta, Roszko, Danuta, Roszko, Roman. “Bulgarian-Polish-Lithuanian Corpus: Current development.” International Workshop: Multilingual resources, technologies and evaluation for Central and Eastern European languages held in conjunction with the International Conference RANLP-2009: Proceedings. Vertan, C., Piperidis, S., Paskaleva, E., Slavcheva, M. (eds.). Borovets, 2009: 1–8.
Dimitrova, Ludmila, Koseska-Toszewa, Violetta, Roszko, Danuta, Roszko, Roman. “Trilingual Aligned Corpus: Current state and new applications.” Cognitive Studies | Études cognitives 2014, no. 2014(14): 13–20.
Duszkin, Maksim, Roszko, Danuta, Roszko, Roman. “New parallel corpora of Baltic and Slavic languages – Assumptions of corpus construction.” Lecture Notes in Artificial Intelligence LNAI 12848: Text, Speech, and Dialogue TSD 2021. Ekštein, K., Pártl, F., Konopík, M. (eds.). Cham: Springer International Publishing, 2021: 173–183. DOI: https://doi.org/10.1007/978-3-030-83527-9_15.
Garncarek, Łukasz, Powalski, Rafał, Stanisławek, Tomasz, Topolski, Bartosz, Halama, Piotr, Turski, Michał, Graliński, Filip. “LAMBERT: Layout-aware language modeling for information extraction.” Document Analysis and Recognition – ICDAR 2021. Lladós, J., Lopresti, D., Uchida, S. (eds.). Cham: Springer International Publishing, 2020: 1–16.
Kisiel, Anna, Koseska-Toszewa, Violetta, Kotsyba, Natalia, Satoła-Staśkowiak, Joanna, Sosnowski, Wojciech. Polish-Bulgarian-Russian Parallel Corpus. CLARIN-PL digital repository, 2016, http://hdl.handle.net/11321/308 (11.11.2021).
Machálek, Tomáš. “KonText: Advanced and flexible corpus query interface.” Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). European Language Resources Association, 2020: 7003–7008.
Piasecki, Maciej, Walentynowicz, Wiktor. “MorphoDiTa-based tagger adapted to the Polish language technology.” Proceedings of Human Language Technologies as a Challenge for Computer Science and Linguistics. Poznań: LTC 2017, 2017: 377–381.
Roszko, Danuta, Roszko, Roman. “Polsko-litewskie korpusy IS PAN i CLARIN-PL.” Prace Bałtystyczne vol. 7. Język. Kultura. Literatura. Birgiel, Nijola, Roszko, Danuta (eds.). Warszawa: Uniwersytet Warszawski, 2018: 185–205.
Roszko, Danuta, Roszko, Roman. “Korpusy wielojęzyczne wkładem Instytutu Slawistyki Polskiej Akademii Nauk w rozwój infrastruktury CLARIN-PL: Przykłady analizy korpusowej nad wołaczem.” Języki słowiańskie dziś – w kręgu kategorii, struktur i procesów. Banasiak, Jakub, Kiklewicz, Aleksander, Mazurkiewicz-Sułkowska, Julia (eds.). Warszawa – Łódź: Instytut Slawistyki PAN – Wydawnictwo Uniwersytetu Łódzkiego, 2021: 281–313.
Roszko, Roman. “O nowych ręcznie zrównoleglonych i znakowanych dwujęzycznych korpusach równoległych oraz ich zastosowaniach.” Acta Baltico-Slavica 2021, no. 2021(45), article 2576.
Roszko, Roman, Sosnowski, Wojciech, Duszkin, Maksim, Roszko, Danuta, Tymoshuk, Roman. Polish-Russian Parallel Corpus, CLARIN-PL digital repository, 2018, http://hdl.handle.net/11321/534 (11.11.2021).
Straka, Milan, Straková, Jana. UDPipe, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, Prague, 2016, http://hdl.handle.net/11234/1-1702 (11.11.2021).
Simov, Kiril, Simov, Alexander, Osenova, Petya. “An XML architecture for shallow and deep processing.” The Proceedings of the ESSLLI 2004 Workshop on Combining Shallow and Deep Processing for NLP, ESSLLI, 2004: 51–60.
Koseska, Violetta, Roszko, Roman. “On semantic annotation in CLARIN-PL parallel corpora.” Cognitive Studies | Études cognitives 2015, no. 2015(15): 211–236. https://doi.org/10.11649/cs.2015.016 (11.11.2021).
Kocoń, Jan, Miłkowski, Piotr, Kanclerz, Kamil. “MultiEmo: Multilingual, Multilevel, Multidomain Sentiment Analysis Corpus of Consumer Reviews.” Computational Science – ICCS 2021. Lecture Notes in Computer Science, vol. 12743. Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.). Cham: Springer International Publishing, 2021.
Kocoń, Jan, Kanclerz, Kamil, Miłkowski, Piotr, Bojanowski, Bartosz, Zaśko-Zielińska, Monika. PolEmo 1.0 + MultiEmo-Test 1.0 Multilingual Sentiment Analysis Dataset for KES2020, CLARIN-PL digital repository, 2020, http://hdl.handle.net/11321/737 (11.11.2021).
Kocoń, Jan, Kanclerz, Kamil, Miłkowski, Piotr. MultiEmo: Multilingual, Multilevel, Multidomain Sentiment Analysis Corpus of Consumer Reviews, CLARIN-PL digital repository, 2021, http://hdl.handle.net/11321/798 (11.11.2021).