python - How to convert UTF-8 to cp1251 to write an ID3v1 tag of an mp3 file?
ID3v1 only supports the latin1 encoding. To write a V1 tag with Russian characters, the cp1251 encoding is used. I want to copy data from the V2 tag (Unicode) to the V1 tag. I read the V2 tag with the following code, using eyeD3:
    tag.link(mp3path, v=eyeD3.ID3_V2)
    mp3album_v2 = tag.getAlbum()
    ...
    tag.link(mp3path, v=eyeD3.ID3_V1)
    tag.setTextEncoding(eyeD3.LATIN1_ENCODING)
    tag.setAlbum(mp3album_v2.encode("cp1251"))
    # tag.update()
The following is returned:
    >>> print mp3album_v2
    Жить в твоей голове
    >>> print type(mp3album_v2)
    <type 'unicode'>
    >>> print repr(mp3album_v2)
    u'\u0416\u0438\u0442\u044c \u0432 \u0442\u0432\u043e\u0435\u0439 \u0433\u043e\u043b\u043e\u0432\u0435'
It looks like setAlbum expects a utf-8 string (?):
    def setAlbum(self, a):
        self.setTextFrame(ALBUM_FID, self.strToUnicode(a));

    def strToUnicode(self, s):
        t = type(s);
        if t != unicode and t == str:
            s = unicode(s, eyeD3.LOCAL_ENCODING);
        elif t != unicode and t != str:
            raise TagException("Wrong type passed to strToUnicode: %s" % str(t));
        return s;
But if I try tag.setAlbum(mp3album_v2.encode('cp1251').encode('utf-8')), I get an error:
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 0: invalid continuation byte
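The error is consistent with cp1251 bytes being interpreted as UTF-8 somewhere along the way. A minimal sketch that reproduces the same kind of failure without eyeD3 (the variable names are illustrative only, and the exact message wording may differ between Python versions):

```python
# -*- coding: utf-8 -*-
album = u"Жить в твоей голове"
cp1251_bytes = album.encode("cp1251")

# In cp1251 the letter "Ж" is the single byte 0xC6.
first_byte = bytearray(cp1251_bytes)[0]

# 0xC6 looks like the start of a two-byte UTF-8 sequence, but the next
# byte is not a valid continuation byte, so decoding as UTF-8 fails.
try:
    cp1251_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc
```

Here decode_error reports a failure at position 0, which matches the byte 0xc6 named in the traceback above.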
ID3v1 cannot reliably contain any non-ASCII characters. You can write cp1251-encoded bytes into an ID3v1 tag, but they will only render as Cyrillic on Russian-locale OS installs, and even there not in all applications.
eyeD3 deals with Unicode strings internally and arbitrarily chooses the latin1 (aka ISO-8859-1) encoding for ID3v1 tags. This is probably not a good choice, because latin1 is never the default locale-specific encoding on a Windows box (for Western Europe it is actually cp1252, which is similar but not the same).
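As a small illustration of "similar but not the same", the sketch below decodes the same bytes with both codecs. The byte values are chosen only for demonstration:

```python
# Byte 0x80 is where the two encodings disagree.
raw = b"\x80"
as_latin1 = raw.decode("latin1")   # U+0080, an invisible C1 control character
as_cp1252 = raw.decode("cp1252")   # U+20AC, the euro sign

# Most accented letters, such as 0xE9, decode identically in both.
shared = b"\xe9"
same_letter = shared.decode("latin1") == shared.decode("cp1252")
```

The two codecs only differ in the 0x80 to 0x9F range, where latin1 has control characters and cp1252 has printable ones.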
However, a property of this choice of encoding is that each byte in it maps to the Unicode character with the same code point number. You can take advantage of this by building a Unicode string whose characters, when encoded as latin1, end up as the byte encoding of your chosen string in a different encoding.
    album_name = u'Жить в твоей голове'
    mangled_name = album_name.encode('cp1251').decode('latin1')
    tag.setAlbum(mangled_name) # will get encoded as latin1, resulting in cp1251 bytes
This is a horrible hack, of questionable benefit, and one of the reasons you should avoid ID3v1.
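The round trip behind the hack can be checked without eyeD3. The following is only a sketch: mangle_for_latin1 is a hypothetical helper name, not part of eyeD3, and the final encode step merely simulates what a latin1-encoding tag writer is assumed to do.

```python
# -*- coding: utf-8 -*-

def mangle_for_latin1(text, target_encoding="cp1251"):
    """Hypothetical helper: return a Unicode string that, once encoded
    as latin1, yields `text` encoded in `target_encoding`."""
    return text.encode(target_encoding).decode("latin1")

album_name = u"Жить в твоей голове"
cp1251_bytes = album_name.encode("cp1251")
mangled_name = mangle_for_latin1(album_name)

# Each character of the mangled string has the code point of one cp1251 byte.
code_points_match = all(
    ord(ch) == byte
    for ch, byte in zip(mangled_name, bytearray(cp1251_bytes))
)

# Simulate the tag writer encoding the string as latin1.
written_bytes = mangled_name.encode("latin1")

# A reader that assumes cp1251 recovers the original text.
recovered = written_bytes.decode("cp1251")
```

The mangled string itself is meaningless if displayed (it shows accented Latin letters); it is only useful as input to code that will encode it as latin1.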