As research teams worldwide face the same problems, a range of approaches is beginning to emerge, some similar, some new and some quite imaginative, for example:
- Lemmatisation and annotation for morphologically rich languages such as Czech and Basque, the latter having even fewer resources available.
- Syntax-based approaches and word reordering for distant language pairs (for example, translation between Asian or Semitic languages and European languages)
- Web-based annotation tools
- Hybridisation of techniques: analysis proceeds from the morphological layer (m-layer) through the analytical layer (a-layer) to the tectogrammatical layer (t-layer), followed by transfer, and then synthesis back through the t-layer, a-layer and m-layer.
- Word sense disambiguation
- Mixture of rule-based and statistical approaches to improve predictability.
- Post-editing effort estimation for MT systems, both with and without linguistic features. Linguistic features are useful for direct error detection and for automatic post-editing, but for sentence-level confidence estimation (CE) there are issues with sparsity and with representation (length bias).
- New metrics such as VERTa, which use linguistic knowledge organised at different levels (lexical, morphological, syntactic and sentence-level semantic information)
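
To make the idea of a metric organised in linguistic levels more concrete, the following is a minimal sketch of how per-level scores might be combined into one sentence score. It is not VERTa's actual formula: the function names `overlap_f1` and `combined_score`, the level names, the equal weights and the use of a simple F1 overlap at every level are all assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Sequence


def overlap_f1(hyp: Sequence[str], ref: Sequence[str]) -> float:
    """F1 over the multiset overlap of two item sequences (tokens, lemmas, ...)."""
    if not hyp or not ref:
        return 0.0
    matched = sum((Counter(hyp) & Counter(ref)).values())
    if matched == 0:
        return 0.0
    precision = matched / len(hyp)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


def combined_score(
    hyp_levels: Dict[str, Sequence[str]],
    ref_levels: Dict[str, Sequence[str]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-level F1 scores (hypothetical combination rule)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        weight * overlap_f1(hyp_levels.get(level, []), ref_levels.get(level, []))
        for level, weight in weights.items()
    )
    return weighted / total_weight


if __name__ == "__main__":
    # Illustrative data: surface forms differ, but the lemmas agree.
    hypothesis = {
        "lexical": ["the", "cats", "slept"],
        "morphological": ["the", "cat", "sleep"],
    }
    reference = {
        "lexical": ["the", "cat", "sleeps"],
        "morphological": ["the", "cat", "sleep"],
    }
    # Equal weights are an assumption, not tuned values.
    weights = {"lexical": 0.5, "morphological": 0.5}
    print(round(combined_score(hypothesis, reference, weights), 3))
```

In this toy case the hypothesis is penalised at the lexical level for differing surface forms but gets full credit at the lemma level, which is the kind of behaviour a multi-level metric is meant to capture for morphologically rich languages.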