perl - Could File::Find::Rule be patched to automatically handle filename character encoding/decoding?
Suppose I have a file named æ (Unicode: U+00E6, UTF-8: 0xC3 0xA6) in the current directory.
Then I would like to use File::Find::Rule to locate it:

    use feature qw(say);
    use open qw( :std :utf8 );
    use strict;
    use utf8;
    use warnings;

    use File::Find::Rule;

    my $fn = 'æ';
    my @files = File::Find::Rule->new->name($fn)->in('.');
    say $_ for @files;

The output is empty, so apparently this did not work.
If I try to encode the filename first:

    use Encode;

    # The pragmas and File::Find::Rule import from the first program still apply.
    my $fn = 'æ';
    my $fn_utf8 = Encode::encode('UTF-8', $fn, Encode::FB_CROAK | Encode::LEAVE_SRC);
    my @files = File::Find::Rule->new->name($fn_utf8)->in('.');
    say $_ for @files;

The output is:

    Ã¦

So it found the file, but the returned filename is not decoded into a Perl string. To fix this, I can decode the result, replacing the last line with:

    say Encode::decode('UTF-8', $_, Encode::FB_CROAK) for @files;

The question is whether both the encoding and the decoding could (or should) have been done automatically by File::Find::Rule, so that I could have used my original program and not have had to worry about encoding and decoding at all.
(For example, could File::Find::Rule have used I18N::Langinfo to determine that the current locale's codeset is UTF-8?)
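To make the idea concrete, here is a minimal sketch of what such automatic handling might look like if done in a wrapper rather than inside the module. The helper names `name_to_bytes`, `bytes_to_name`, and `find_decoded` are hypothetical and not part of File::Find::Rule, and the sketch assumes that the locale's codeset (as reported by I18N::Langinfo) really is the encoding used for file names on disk, which is not guaranteed.

```perl
use strict;
use warnings;
use Encode ();
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: turn a decoded Perl string into file-system bytes.
sub name_to_bytes {
    my ($name, $codeset) = @_;
    return Encode::encode($codeset, $name, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Hypothetical helper: turn file-system bytes back into a Perl string.
# FB_CROAK makes this die on a name that is not valid in $codeset.
sub bytes_to_name {
    my ($bytes, $codeset) = @_;
    return Encode::decode($codeset, $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Hypothetical wrapper: search the directories in @$dirs for a name pattern
# given as a decoded string, and return decoded paths.
sub find_decoded {
    my ($name, $dirs, $codeset) = @_;

    # Assumption: the locale's codeset matches how file names are stored.
    $codeset //= langinfo(CODESET);

    require File::Find::Rule;    # CPAN module, loaded only when searching

    my $bytes = name_to_bytes($name, $codeset);
    return map { bytes_to_name($_, $codeset) }
        File::Find::Rule->new->name($bytes)->in(@$dirs);
}
```

With this in place, the original program could call `find_decoded('æ', ['.'])` and print the results directly. Note that the wrapper dies as soon as one returned path is not valid in the assumed codeset.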
Yeah, I wish. If there's a major Perl project I'd work on, this is it.
The issue is that there can be badly-encoded file names, including file names encoded using a different encoding than the one expected. That means the first thing needed is a way of round-tripping badly-encoded file names through the decode-encode process. I think Python uses lone surrogate code points to represent the bad bytes (its surrogateescape error handler).
Then you'd need a pragma to ensure backwards compatibility.
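The round-tripping problem can be illustrated with a small sketch using Encode alone. The helper names `decodes_strictly` and `round_trips` are made up for this example; the file name `"\xE6"` stands for an æ stored as Latin-1 on a system where UTF-8 is expected.

```perl
use strict;
use warnings;
use Encode ();

# True if $bytes is valid UTF-8, i.e. a strict decode does not die.
sub decodes_strictly {
    my ($bytes) = @_;
    return eval {
        Encode::decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    } ? 1 : 0;
}

# True if decoding leniently and encoding again gives back the same bytes.
# The default CHECK replaces malformed bytes with U+FFFD, which loses them.
sub round_trips {
    my ($bytes) = @_;
    my $chars = Encode::decode('UTF-8', $bytes);
    return Encode::encode('UTF-8', $chars) eq $bytes ? 1 : 0;
}

for my $name ("\xC3\xA6", "\xE6") {
    printf "%-8s strict=%d round_trip=%d\n",
        join(' ', map { sprintf '%02X', ord } split //, $name),
        decodes_strictly($name), round_trips($name);
}
```

A strict decode rejects the Latin-1 name outright, and a lenient decode accepts it but can no longer reproduce the original bytes, so the file could not be opened by the name that was returned. Automatic handling would need a third option that keeps the bad bytes recoverable.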