Tyreano.com

Issues to consider when localizing your website

In this article, I’ll provide an overview of some things to consider when translating or localizing your website. Drawing from my experience as a translator and IT specialist, I will try to highlight not only several linguistic considerations, but also some subtle practical and technical issues to take into account.

Why is a website different from a “normal” translation project?

In the simplest case, the translation of a website may not be significantly different from the translation of normal documents. You may find that you can provide a static copy to the translator in a Word file and then extract and upload the text when you receive it in the same format.

However, many websites do not consist of a few pages of static text, which means that a website translation project may require special consideration and additional skills on the part of the translator:

  • you may have pages built “on the fly” from a database instead of existing as static files;
  • you may have a server application, for example to process form input, which in turn generates user-visible text;
  • from a linguistic point of view, it is rare for the content of a website to relate to only one field: some IT terminology will almost certainly sneak in somewhere.

For the first two of these reasons, it is not uncommon for your website to involve text in different formats held in different files. You may have some raw HTML or text files that you can easily extract to a text file or Word document from your content management system, plus some data in a database that you may need to extract to a CSV file or SQL dump, plus some properties files used by your back-end server. In the initial stages of obtaining a quote for the project, tell the translator which file format is most convenient for you to work with (and send a sample), and ask whether they can work with that format. (In my case, for example, I have seen clients spend time converting CSV files into Word documents, altering the text in the process, when I would much rather have worked with the original CSV files.)

Linguistic problems

Although most websites will include some IT terminology at some point, this probably shouldn’t be the main linguistic issue involved in localizing a website. My reason for saying this is that, given the technical issues we’ll be looking at below, I highly recommend hiring a translator who is knowledgeable about IT for the website translation in the first place.

An initial linguistic decision, but one that the translator can probably make for you, is the form of address. As you may already know, various languages use different verb forms to address the reader or listener either “informally” or “formally” (e.g. the tu vs vous distinction in French), with some languages even having a three-way distinction. The appropriate form of address will depend on your target audience and the customs of the countries you are targeting; the translator may therefore need to ask you who your main target audience is and what impression you want to make (do you want your text to sound “serious” or more “modern”?).

Other linguistic problems arise when translating short items from a database or a properties file, where there is sometimes a lack of context. Do you mean a “check” as in a “cheque”, or as in a “verification”? Do you mean “up” as in “higher price” or as in “go to the top of the page”? And in the case of strings that take parameters (indicated by the sequences {0}, {1}, etc. in properties files in Java and various other languages), what are the different values that these parameters can take (since that can affect the translation)?
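To make the point about parameters concrete, here is a minimal sketch of how such a string might be filled in using Java's java.text.MessageFormat. The message texts, the meaning of each parameter and the class name are invented for illustration; your application may build its strings differently.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class ParameterizedMessages {

    // Fills the {0}, {1}, ... placeholders of a pattern for the given locale.
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // Hypothetical entries, as they might appear in two properties files.
        // {0} = number of items, {1} = the customer's first name.
        String english = "You have {0} items in your basket, {1}.";
        String french = "{1}, votre panier contient {0} articles.";

        System.out.println(format(english, Locale.ENGLISH, 3, "Marie"));
        System.out.println(format(french, Locale.FRENCH, 3, "Marie"));
    }
}
```

Note that the translator is free to reorder the placeholders, as in the French pattern, but can only do so sensibly when told what each one stands for. One practical detail worth passing on: MessageFormat treats a single quote as a quoting character, so a literal apostrophe in a pattern has to be written as two single quotes.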

Sometimes solving these problems will require you to answer direct questions from the translator about the interpretation of your text. But as a simple measure that can save you some time and questions, I recommend using multiple properties files. Let each main area of your site or app have its own properties file. In particular, let sections of your server or site that are aimed at different people have their own properties files. Crucially, if you can avoid it, don’t mix in the same file strings that are aimed at the website visitor and strings that are part of your back-end management system.
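As a rough sketch of what this separation might look like in Java, the example below keeps visitor-facing and back-end strings in two separate bundles. The bundle contents and key names are hypothetical, and the texts are held in string constants only so that the example runs on its own; in a real project each would be its own file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class SplitBundles {

    // Stand-in for a hypothetical visitor_fr.properties: text the site visitor sees.
    static final String VISITOR_FR =
            "checkout.title=Paiement\n"
          + "basket.empty=Votre panier est vide\n";

    // Stand-in for a hypothetical admin_fr.properties: text only back-end staff see.
    static final String ADMIN_FR =
            "orders.pending=Commandes en attente\n";

    // Parses properties-format text into a resource bundle.
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle visitor = load(VISITOR_FR);
        ResourceBundle admin = load(ADMIN_FR);
        System.out.println(visitor.getString("checkout.title"));
        System.out.println(admin.getString("orders.pending"));
    }
}
```

With real files on the classpath, the same bundles would more usually be obtained with ResourceBundle.getBundle and a Locale. Either way, the translator receives one file per audience and can settle on tone and form of address file by file.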

Practical and technical problems

When you receive the translated material from the translator (or indeed, ideally beforehand!), there are one or two practical aspects to consider. You may have already noticed the differences in word count that can occur from one language to another (typically, text in Latin-derived languages such as French and Spanish is about 20-30% longer than its English counterpart). This could have an effect not only on the layout of your page, but also on the size of the fields in your database. More subtly, the character count in another language may be similar, but the word count could vary drastically if that language uses compounding more extensively than English (for example, you may find that a text translated into Finnish has a character count similar to the English, but half the number of words). A narrow column layout that works on your English page can suddenly look disastrous when applied to the German or Finnish translation.

If your site is interactive, then you have the additional problem of accepting the input that users expect to be able to provide in your web forms, etc. This will include, for example, the ability to enter accented characters or a greater variety of characters, plus some more subtle changes to your site’s validation. For English, you may have rejected spaces in the Surname field. But speakers of several other languages commonly have more than one surname and would expect to be able to enter a space in this field.
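As an illustration, a more permissive surname check might look something like the sketch below. The exact rule (letters from any script, separated by single spaces, hyphens or apostrophes) is my own assumption, not a standard; what counts as a valid name is a policy decision for your site.

```java
import java.util.regex.Pattern;

final class SurnameValidator {

    // One or more runs of letters (any script, including combining marks),
    // separated by a single space, apostrophe or hyphen.
    // This rule is an illustrative assumption, not an established standard.
    private static final Pattern SURNAME = Pattern.compile(
            "[\\p{L}\\p{M}]+(?:[ '\u2019-][\\p{L}\\p{M}]+)*");

    static boolean isValid(String input) {
        return input != null && SURNAME.matcher(input.trim()).matches();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        System.out.println(isValid("Garc\u00eda M\u00e1rquez")); // two surnames with accents
        System.out.println(isValid("O'Neill"));
        System.out.println(isValid("Smith2"));                    // digit is rejected
    }
}
```

The three calls print true, true and false respectively. The important change from a typical English-only check is the use of \p{L} rather than [A-Za-z], together with the allowance for spaces.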

Two other problems, sometimes related, are character encoding and collation. The former essentially refers to the way the computer stores or represents characters (how characters are translated into bytes). The latter concerns how characters and strings are compared and sorted: for example, whether an e with an acute accent is considered equal to one without an accent for search purposes, and in what order they appear when sorting. These issues usually do not arise when you are dealing with English only, but they normally need to be taken into account when dealing with text in another language.

Character encoding differs from system to system, with some common standards including ISO-8859-1, UTF-8, and other encodings such as Mac OS Roman. Depending on your website / app, you may need to make sure you have the correct character encoding set up in multiple layers:

  • when reading from the translated file;
  • when reading / writing to your database via JDBC or other application layer framework;
  • when reading data entered by the user through the Servlet API, etc.;
  • in the database’s own field definitions, to ensure that they can store the required character range.
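For the first of these layers, a minimal Java sketch of reading a translated properties file with an explicit encoding is shown below. It assumes the translator delivered the file in UTF-8; the key, the text and the class name are invented, and the file is simulated with a byte array so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EncodingAwareLoader {

    // Loads properties from a stream that is known (or assumed) to be UTF-8.
    static Properties loadUtf8(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            Properties props = new Properties();
            props.load(reader);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated UTF-8 file content; \u00e9 is an e with an acute accent.
        byte[] file = "saved=Donn\u00e9es enregistr\u00e9es\n"
                .getBytes(StandardCharsets.UTF_8);

        // Explicit encoding: the accented characters survive.
        Properties good = loadUtf8(new ByteArrayInputStream(file));
        System.out.println(good.getProperty("saved"));

        // Properties.load(InputStream) decodes as ISO-8859-1, so the same
        // bytes come out garbled.
        Properties bad = new Properties();
        bad.load(new ByteArrayInputStream(file));
        System.out.println(bad.getProperty("saved"));
    }
}
```

The other layers have their own switches, which depend on your stack: for instance, the Servlet API lets you call setCharacterEncoding on the request before reading parameters, and JDBC drivers and databases generally have their own encoding settings, which you would need to check in their documentation.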

How do you know if you have the correct character encoding? A tell-tale sign of incorrect character encoding in various Latin-based languages like French and Spanish is if you frequently see sequences of two accented characters side by side, possibly including a capital letter in the middle of a word. (This happens when a file encoded in UTF-8 is incorrectly interpreted as being in ISO-8859-1 or a Mac OS encoding.)
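The short Java sketch below reproduces this symptom deliberately, and adds a rough check for it. The check is a heuristic of my own for illustration (it looks for character pairs that correspond to two-byte UTF-8 sequences read as ISO-8859-1) and can produce false positives, so treat it as a hint rather than a test.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

final class MojibakeDemo {

    // A character in the range of UTF-8 lead bytes (C2-DF) followed by one in
    // the range of continuation bytes (80-BF), each seen as a Latin-1 character.
    private static final Pattern SUSPECT_PAIR =
            Pattern.compile("[\\u00C2-\\u00DF][\\u0080-\\u00BF]");

    // Encodes text as UTF-8, then wrongly decodes the bytes as ISO-8859-1.
    static String misreadUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Heuristic only: true if the text contains a suspicious pair.
    static boolean looksMisdecoded(String text) {
        return SUSPECT_PAIR.matcher(text).find();
    }

    public static void main(String[] args) {
        String original = "Donn\u00e9es enregistr\u00e9es";
        String garbled = misreadUtf8AsLatin1(original);
        System.out.println(garbled);
        System.out.println(looksMisdecoded(original)); // false
        System.out.println(looksMisdecoded(garbled));  // true
    }
}
```

On a UTF-8 console the first line printed is “DonnÃ©es enregistrÃ©es”: each accented e has become a capital Ã followed by a © sign, which is exactly the pattern described above.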

The collation (sort / compare) problem can be dealt with at the database layer (most database systems allow you to configure the collation for a particular column, table or database). Or it can be handled at the application layer (in Java, look at the Collator class as an alternative or extension to the raw methods Collections.sort() and String.equals()).
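For the application-layer option, here is a minimal sketch of using java.text.Collator for sorting and for accent-insensitive comparison. The word list and the class and method names are made up for the example; the appropriate locale and comparison strength will depend on your audience.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CollationDemo {

    // Returns a copy of the words sorted by the locale's collation rules.
    static List<String> sortFor(Locale locale, List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    // Compares two strings while ignoring accent (and case) differences.
    static boolean sameIgnoringAccents(Locale locale, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("zebra", "\u00e9tude", "eau");

        // Plain sort compares raw character codes, so the accented word goes last.
        List<String> raw = new ArrayList<>(words);
        raw.sort(null);
        System.out.println(raw);

        // Locale-aware sort places the accented word next to its unaccented neighbours.
        System.out.println(sortFor(Locale.FRENCH, words));

        System.out.println("cote".equals("c\u00f4t\u00e9"));                             // false
        System.out.println(sameIgnoringAccents(Locale.FRENCH, "cote", "c\u00f4t\u00e9")); // true
    }
}
```

The plain sort gives eau, zebra, étude, whereas the French collator gives eau, étude, zebra. Setting the strength to PRIMARY makes the comparison ignore case as well as accents, which is often what you want for search but not necessarily for uniqueness checks.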

Conclusion

I hope I have highlighted in this article some of the main areas of concern when it comes to localizing a website, and have shown that these problems can go well beyond the translation itself. Working with a translator who is aware of these issues could save you time and effort in making your business available in the different countries you want to target.
