I have an UMLAUT table. Pretty simple really…
CREATE TABLE "HR"."UMLAUT" ( "CHARACTERS" VARCHAR2(2) ) ; REM INSERTING INTO UMLAUT SET DEFINE OFF; INSERT INTO UMLAUT (CHARACTERS) VALUES ('Ñ'); INSERT INTO UMLAUT (CHARACTERS) VALUES ('Ä'); INSERT INTO UMLAUT (CHARACTERS) VALUES ('Ñ'); COMMIT;
Looks great in SQLDev, but not so much in SQLcl…
Everything is UNICODE by default in Java applications. It’s converted to Unicode coming in and going out… [docs]
This is a HUGE advantage in the Java world. We don’t have to do anything special when it comes to supporting Unicode.
So what’s happening above?
Windows CMD by default isn’t setup to use a Unicode code page.
So yeah, that’s not going to work.
UNTIL you do this.
The CHCP command changes the code page for the CMD window. And the 65001 code page is for UTF-8.
But, you probably don’t want to have to do this every time you open a CMD window.
So, buy a Mac and use a proper BASH shell environment 😉
Or, you can follow the advice here on StackOverflow. You basically edit the Windows registry to start CMD out with the Unicode code page.
23 Comments
Hi Jeff,
I tried the original workaround provided and some of the suggestions provided by the above forum,
I am able to see the content clearly from SQL developer,
https://imgur.com/a/GmMaOn5
but when I use SQLcl I am not able to see the same. Then I tried changing the default character encoding. also tried set encoding option , but still no luck.
with chcp 65001 , I am seeing
”Dépôt de ch que Serv banc mob – 4474″
the original value should be – “Dépôt de chèque Serv banc mob – 4474”
I want to extract some data to a text file through SQLcl in batch / background mode. All french content data is having this issue when it is accessed through SQLcl.
The current version of SQLcl (18.3.0.0) no longer includes a bin\sql.bat loader. The “-Dfile.encoding=UTF-8” argument can also be passed to Java by creating a Windows environment variable:
JAVA_TOOL_OPTIONS=’-Duser.language=en -Duser.region=US -Dfile.encoding=UTF-8′
(the variable value should not have quotation marks)
You’ll know you did it right because the first line shown when instantiating SQLcl will say “Picked up JAVA_TOOL_OPTIONS…”:
H:\>sql.exe user/password@host:1521/sid
Picked up JAVA_TOOL_OPTIONS: -Duser.language=en -Duser.region=US -Dfile.encoding=UTF-8
Hi,
We had a great time this past week trying to configure SQLcl in Windows 7.
In the end the following 3 Items had to be set.
1. Font in the properties of the SQLcl window has to be set to a true type font. (right-click the title at the top of the window to access)
This can be set permanently by changing the defaults font for command prompt window (CMD) located in the windows\system32 folder and the one located in windows\sysWOW64.
(not sure which of the 2 SQLcl pulls it’s font setting from but it’s one of them)
2. Set the encoding in SQLcl.
set encoding cp1252 in my case.
An easy way to find the correct one is to run the ‘show encoding’ command in SQL Developer’s editor.
3. Set to the Code Page for the HOST
in sqlcl this can be done with the following command:
HOST CHCP 1252
And on a Mac you have to do… nothing. It just works.
Windows CMD sucks.
I don’t know if Powershell is any better though.
Using 4.1.5.21.78 on Windows, the sql.bat script changes the codepage to 65001 (Unicode).
However, it does not restore the previous codepage before exiting. This causes subsequent applications to not work as expected.
SQLcl: Release 4.2.0 Production on Tue Apr 11 10:26:01 2017
Still not fixed in sqldeveloper-17.4.1.054.0712-no-jre.
If sql.bat is going to change the codepage, then sql.bat must reset it to the codepage the user was using before sql.bat started.
Hi Jeff,
This is an old thread, but I have some interesting findings with 4.2.0.16.355.0402 on Windows 10.
– This is not yet mentioned in this thread : you should switch your console font from “Raster Font” to a truetype font, e.g. “Consolas”. Your link to chcp explains this.
– The chcp 65001 is already set in sql.bat, and makes sure the output in sqlcl is always ok, whatever is the value in “show/set encoding”, except when you do “set sqlformat ansiconsole”. Then it is wrong and you need to add ” -Dfile.encoding=UTF-8″ to STD_ARGS in bin\sql.bat to get it right. Setting encoding to UTF-8 is not enough. When you add the file.encoding, then the “show encoding” will default to UTF-8.
– Since the bin\sql.bat changes the chcp permanent, we have to reset it to original settings afterward, or you have to change NLS_LANG to use AL32UTF8 or UTF8 ( did not see any differences ). But you have to be aware that all input and output files will have to be in UTF-8 encoding.
– Using cmd.exe or powershell with codepage 65001 has some strange results which scare me away : sqlplus prints fine, except when a strange character is on the first column of the line, then it is not OK. If you spool to a file and display the contents with “type”, then it is fine. It might be something in sqlplus 11.2
– If you test with non-printable characters, you will see different things depending on the TTF font choosen, e.g. the separators:
select ‘FS=’||chr(28)||’,GS=’||chr(29)||’,RS=’||chr(30)||’,US=’||chr(31)||’;’ from dual;
Regards,
Dirk
Reading your post, I was hoping that this time around, we can actually use sqlcl in production. Not so. Our database has EE8MSWIN1250 code page. Our clients are Windows 7 and 10 (CP1250).
Our result of changing cmd to chcp 65001 and the latest sqlcl is this:
On Windows 7
~~~~~~~~~~~~~
– sqlcl will display extra charaters. We enter 10 umlauts, we’ll get extra output, like this:
ŠšČčĐđĆ掞
Ć掞
ž
Only the first line should display.
– real show stopper is that we can’t write *any* umlauts on command line of sqlcl if chcp is set to 65001 (instead of umlaut, we see square)
Windows 10
~~~~~~~~~~~~~
– sqlcl will display umlauts correctly if we set chcp to 65001 (no extra characters)
– the problem of entering umlats on command line of sqlcl is the same as on Windows 7. Show stopper.
The end result is that sqlcl is (still) completely unusable on Windows 7/10 in real life. At least for us, using 8-bit DB (EE8MSWIN1250) on backend and Win7/10 clients with Eastern Europe (CP1250) codepage.
That said, I found your recommendation (link tothe stackoverflow) of tweaking cmd globally in registry very dangerous. chcp 65001 makes cmd.exe (and powershell as well) broken. And what about all the other uses of cmd (sqlldr, sqlplus, etc.).
Linux & Macs are the only real alternative, this certainly works for me, but all our developers use Windows 7 or 10 and sqlcl is unfortunately of no use for them :-(. Pity.
Regards,
AlesK
I
Maybe you need a different code age. Win 10 working and Win 7 not working seems to be the problem isn’t with SQLcl, but with your setup, somewhere.
How does a UTF-8 code page break cmd.exe? As always, test and verify. If it works, then change the registry to default to that. If it doesn’t work, then don’t do it.
Did you check the fonts as well? That plays a big piece in the equation, you have to pick a font that supports the character you’re working with.
Can you give me a table with a set of inserts I can test against?
SQLcl isn’t ready for production yet, it’s still an Early Adopter. That’s about to change though.
Hi Jeff,
Win 10 works 50% :), select’s works, sort of, we can’t use umlauts nowhere in the WHERE condition, which is a deal breaker. But the output at least doesn’t reproduce extra characters.
> How does a UTF-8 code page break cmd.exe?
Sorry, bad wording. Setting UTF-8 in registra makes cmd.exe unusable for other work. Let’s say that I want to insert some line with umlauts in SQL*Plus (setting NLS_LANG to SLOVENIAN_SLOVENIA.UTF8 beforehand). The result immediate sqlplus crash (11.2.0.4). No one can use SQL*Plus to run existing scripts any more….well, that’s why I consider cmd.exe being “broken” if we set to UTF-8.
You can use table you posted, just insert:
insert into UMLAUT values(‘Šš’);
insert into UMLAUT values(‘Čč’);
insert into UMLAUT values(‘Žž’);
Of course, if you wish to reproduce my real environment, then you’ll need at least db with EE8MSWIN1250 characterset — if that’s relevant at all.
Regarding fonts, I’m aware of them, they were suspect #1, so I tried every one of them without success. And it’s not just my machine, we tested this on 8 of them, from Lenovo Windows default installations that shipped to us with factory default installations, to our Windows Enterprise installment.
I’ll try my luck with ConEmu.
Regards,
Ales
I see what you mean on the WHERE clause bit. Digging now.
Hi Jeff,
some good and bad news about sqlcl and UTF8 on Windows.
I think you’ll have to patch official launcher (sql.bat) file, adding parameter -Dfile_encoding=UTF-8, my line looks like:
java -Dfile.encoding=UTF-8 %STD_ARGS% %DEBUG% -cp “%CPFILE%” oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli %*
That’ll partly solve the issue I was complaining above. Instead of seeing square instead of umlaut, we see proper umlaut followed by the square. Better, but still weird. No matter what we did, cmd.exe and powershell behaves like this.
The only workable solution at this point is to use ConEmu instead of cmd.exe (of course, java -Dfile_encoding=UTF-8….is still mandatory in sql.bat).
HTH,
Ales
Forgot to mention that ConEmu is mandatory for Windows 7 clients (99% of my “clients”), on Windows 10, cmd.exe works as it should if -Dfile.encoding=UTF-8 swith is added to sql.bat.
Regards,
Ales
we added in latest SQLcl…
I am HR on orcl > help set encodi
SET ENCODING
SET ENCODING { UTF8,GBK,…}
I am HR on orcl > show encodings
current Encoding:Cp1252
List of available encodings:
Big5
Big5-HKSCS
CESU-8
Cp1252
EUC-JP
EUC-KR
GB18030
GB2312
GBK
IBM-Thai
IBM00858
IBM01140
IBM01141
IBM01142
IBM01143
IBM01144
IBM01145
IBM01146
IBM01147
IBM01148
IBM01149
IBM037
IBM1026
IBM1047
IBM273
IBM277
IBM278
IBM280
IBM284
IBM285
IBM290
…
Thanks Jeff, but the SET ENCODING does not work for us.
On Windows 7:
the only workaround is to replace the line in sql.bat
SET STD_ARGS=-Djava.awt.headless=true -Xss10M
with
SET STD_ARGS=-Djava.awt.headless=true -Xss10M -Dfile.encoding=UTF-8
and use ConEmu instead of cmd.exe/powershell.ece.
On Windows 10: the change in sql.bat is enough, you don’t need ConEmu.
You can see the complete example here:
https://dbaportal.eu/2016/11/17/unicode-and-oracle-sqlclon-windows-solved/
Regards,
Ales
you don’t need unicode. it should be fine with
C:> set NLS_LANG=american_america.we8pc850
SQL> select distinct CLIENT_CHARSET from v$session_connect_info where sid in (select sid from v$mystat);
CLIENT_CHARSET
—————————————-
WE8PC850
SQL> select * from umlaut;
CH
—
Ñ
Ä
Ñ
we don’t read os env variables to set NLS
Nice trick.
You could also just run bash natively on Windows 10. Or use Fedora 24.
Hi Jeff,
You could also create a cmd shortcut or a cmdlaunch.cmd batch file and pass the chmd command as input parameter using /k switch.
Surely, it is not bash 🙂
I think the registry/permanent solution is the way to go, only b/c i normally go, Start > Run > CMD
Or use ConEmu on Windows. 🙂
https://conemu.github.io
I’ve used it. I imagine Windows users won’t though. I’m waiting for MSFT to build a decent terminal.