VectorLinux
Author Topic: Transferring files with Khmer names  (Read 1521 times)
Andy Price
Packager
Vectorite
****
Posts: 237


« on: August 18, 2009, 07:31:28 am »

Hi everyone

I'm working on a project to produce teaching materials in Khmer using the Khmer version of OpenOffice running under Windows with Khmer Unicode installed. My colleagues have produced a number of files in Writer and Calc, and the file names include both Khmer and English characters. I can move them around in Windows ok and I can copy them from my USB drive to my home directory in Vector 6.0 Gold, but I can't copy them back to the USB drive. When I try I get an error message such as: "Failed to open "/media/TOSHIBA/???????? Songs.odt" for writing (Invalid argument)". As you can see, Khmer characters show up as ??? in the file name in Vector. I have Khmer TTF fonts installed on the system and they show up correctly inside the documents.

Khmer isn't listed in the language choices on GDM, and in any case I don't really want to convert my installation to Khmer, even temporarily (I don't speak Khmer myself) - I just want to be able to store the files on my hard drive and copy them onto a USB drive as necessary. Any ideas?

Thanks
Andy
Logged
Daniel
Packager
Vectorian
****
Posts: 704


WWW
« Reply #1 on: August 18, 2009, 07:57:25 am »

I don't know but you might be able to put the Khmer file into a zip file with no Khmer characters in the zip file's name and transfer it that way.
Logged

The following sentence is true. The previous sentence is false.

VL 6.0 SOHO KDE-Classic on 2.3 Ghz Dual-core AMD with 3 Gigs of RAM
Andy Price
Packager
Vectorite
****
Posts: 237


« Reply #2 on: August 19, 2009, 07:09:18 am »

Thanks for the suggestion Daniel. I remember trying zip files some time ago but decided to give it another try anyway.

The first interesting thing is that when I try to zip files with Khmer names using Windows' built-in zip utility it gives an error and refuses to work. This is in line with the fact that when I try to use TaskZip to back up documents in school the resulting archives are empty of files with Khmer characters in their names (though no error messages are given).

Back to Vector. If I zip the files using Xarchiver and give the archive an English name it copies to the USB drive ok, but when I transfer the archive to a Windows machine and try to view the files they are not visible using Windows' built-in zip utility - the archive appears empty. If I use Portable PeaZip in Windows then I can extract the files but the Khmer part of the file name comes over as a series of dashes. Even booting the Windows machine to Puppy Linux and extracting the files that way loses the Khmer characters. So it seems that the zip format just can't handle Khmer file names.

If that's the case, is there anything else I can try?
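
For anyone reading this later: the zip format itself does have a way to mark file names as UTF-8 (general-purpose flag bit 11), but older tools often neither set nor honour it, which may be what happened here. The following is a minimal sketch using Python's standard zipfile module, not any of the tools mentioned in this thread. The file name is a hypothetical example written as Unicode escapes, and roundtrip_names is a helper invented for illustration.

```python
import io
import zipfile

# Hypothetical file name: six Khmer code points (as escapes) plus an ASCII part.
khmer_name = "\u1785\u1798\u17d2\u179a\u17c0\u1784 Songs.odt"

UTF8_NAME_FLAG = 0x800  # general-purpose bit 11: "file name is UTF-8"


def roundtrip_names(names):
    """Write empty entries with the given names to an in-memory zip,
    then read back (name, has_utf8_flag) pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return [(info.filename, bool(info.flag_bits & UTF8_NAME_FLAG))
                for info in zf.infolist()]


result = roundtrip_names([khmer_name, "plain.odt"])
for name, flagged in result:
    print(repr(name), "UTF-8 flag set" if flagged else "no UTF-8 flag")
```

If the Khmer name comes back unchanged with the flag set, the archive itself is fine and any loss is happening in the tool that extracts it.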

Logged
brokndodge
Member
*
Posts: 83


Linux is sooo HOT


WWW
« Reply #3 on: August 19, 2009, 06:35:08 pm »

found some interesting reading on unicode:

http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps
http://en.wikipedia.org/wiki/WinZip

it seems that unicode is far from completely supported in linux applications.  many of the gnu tools we take advantage of from day to day, may not be fully patched to accept unicode characters.  the winzip wikipedia article says that you must use winzip >= 11.2 for unicode support.  i don't know the target version the linux tools support, but the windows side should be solved by installing the latest winzip, rather than using the builtin tool. 

according to responses in this thread:
http://blogs.msdn.com/michkap/archive/2005/05/10/416181.aspx
rar and 7-zip handle unicode flawlessly.

as to problems transferring to the usb, i believe usb drives use a fat32 partition.  fat32 is a legacy partition type and the linux tools may not be completely up to date as ms actively seeks to kill anyone that wants to read, write, execute to or from fat32.  maybe windows has updated that partition type for unicode support, but due to its age it may not have been updated in linux. some type of archive that handles unicode may be in order.  i'd start with 7-zip.  it's open source and cross platform.  your mileage may vary, but please let us know if it helps.
 
Logged

VL 7.0 Standard

brokndodge
- OSS is not a religion, it's the solution to buggy irresponsible coding -
Linux User# 494720
Andy Price
Packager
Vectorite
****
Posts: 237


« Reply #4 on: August 21, 2009, 01:56:03 am »

Thanks for the links and suggestions. 7-zip works nicely as long as I use the 7z archive type. I can now create an archive in Windows and save it in Vector with the Khmer file names intact. If I extract it in Vector I lose the Khmer names, but I can get away without doing that.

Looking back at what I originally wrote I think I was incorrect - I never did transfer the files correctly to Vector from the USB drive. I assumed that the ??? in the file names just meant that the Khmer characters couldn't be displayed, but in fact I think they were already lost.
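
A small sketch of why the ??? cannot be undone once it has happened: converting a name to a character set that has no Khmer letters replaces each one with a literal question mark, and the original code points are gone. The charset (latin-1) and the file name below are assumptions chosen for illustration; the thread does not say which charset Vector actually used for the USB drive.

```python
def to_legacy_name(name, charset="latin-1"):
    """Encode a file name into a legacy charset and back,
    replacing every character the charset cannot represent with '?'."""
    return name.encode(charset, errors="replace").decode(charset)


# Hypothetical file name: six Khmer code points (as escapes) plus an ASCII part.
khmer_name = "\u1785\u1798\u17d2\u179a\u17c0\u1784 Songs.odt"

mangled = to_legacy_name(khmer_name)
print(mangled)  # ?????? Songs.odt

# The conversion is one-way: every Khmer code point maps to the same '?',
# so nothing in the mangled name says which letters were there.
print(mangled == khmer_name)  # False
```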

And the reason I couldn't transfer them back to the USB drive was - as pointed out - because of the FAT32 file system. The ? is an illegal character in DOS file names, but when I formatted the USB drive as Ext2 the files transferred ok.
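
To check names before copying to a FAT32 drive, something like the sketch below could help. It only tests for the characters that FAT long file names forbid (\ / : * ? " < > | and control characters); fat_problems is a hypothetical helper name, and other FAT rules such as reserved names and length limits are not covered.

```python
# Characters not allowed in FAT/Windows file names.
FAT_FORBIDDEN = set('\\/:*?"<>|')


def fat_problems(name):
    """Return the sorted list of characters in a file name (not a full path)
    that a FAT file system would reject."""
    return sorted({ch for ch in name if ch in FAT_FORBIDDEN or ord(ch) < 32})


for candidate in ["???????? Songs.odt", "Songs.odt"]:
    bad = fat_problems(candidate)
    if bad:
        print(f"{candidate!r}: not valid on FAT, contains {bad}")
    else:
        print(f"{candidate!r}: ok")
```

A name that already contains literal question marks fails this check, which is consistent with the "Invalid argument" error in the first post.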

Thanks
Andy
Logged