
Parsing the file list returned by an FTP server

Example listing returned by the server:
drwx------   3 user group            0 Dec  2 12:09 Java_server_proxy
drwx------   3 user group            0 Dec  1 12:42 agent
drwx------   3 user group            0 Nov 26 09:21 download
drwx------   3 user group            0 Nov 27 07:18 gae-server
drwx------   3 user group            0 Nov 28 16:11 openssl
drwx------   3 user group            0 Nov 26 08:09 python
-rw-------   1 user group        15672 Dec  2 12:00 Java_server_proxy.rar
-rw-------   1 user group        18650 Dec  2 12:04 Java_server_proxy.zip
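A listing in this format is what the FTP LIST command typically returns. As a minimal sketch (not part of the original post) of how such a listing can be fetched with the standard ftplib module; the host and credentials are hypothetical placeholders:

# Minimal retrieval sketch using the standard ftplib module.
# "ftp.example.com", "user" and "password" are hypothetical placeholders.
import ftplib

def fetch_listing(host, user, password):
    lines = []
    ftp = ftplib.FTP(host)
    ftp.login(user, password)
    # retrlines invokes the callback once per line of the LIST response
    ftp.retrlines("LIST", lines.append)
    ftp.quit()
    # rejoin with a trailing newline so Get_File_Name below counts lines correctly
    return "\n".join(lines) + "\n"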


import re


# Build per-field summary strings from the whitespace-normalized listing.
def parse_file_list(parsed_list, file_quantity):

    all_file_name = ""
    all_file_create_date = ""
    all_file_size = ""
    all_file_user = ""
    all_file_operator_limit = ""

    lines = parsed_list.split("\r\n")

    for i in range(0, file_quantity):

        # Handle one file: split on at most 8 spaces so that a file name
        # containing spaces stays intact in the final field.
        the_line = lines[i].split(" ", 8)
        file_name = the_line[8]
        file_create_date = the_line[7] + "    created at   " + the_line[6] + " " + the_line[5]
        file_size = the_line[4]
        # hard-link count, owner and group
        file_user = the_line[1] + " " + the_line[2] + " " + the_line[3]
        file_operator_limit = the_line[0]

        all_file_name += file_name + "\r\n"
        all_file_create_date += file_name + " " + file_create_date + "\r\n"
        all_file_size += file_size + "\r\n"
        all_file_user += file_user + "\r\n"
        all_file_operator_limit += file_operator_limit + "\r\n"

    return all_file_name, all_file_create_date, all_file_size, all_file_user, all_file_operator_limit


# Pass in the raw listing after decoding it with decode("utf-8").
def Get_File_Name(file_list):
    # number of files: the listing ends with a newline, so the last
    # element of the split is empty and is not counted
    file_quantity = len(file_list.split("\n")) - 1

    raw_lines = file_list.split("\n")
    parsed_list = ""

    for i in range(0, file_quantity):
        # collapse runs of whitespace into a single space and drop the
        # stray "\r" left over from "\r\n" line endings
        one_line = re.sub(r'\s+', ' ', raw_lines[i]).strip()
        parsed_list += one_line + "\r\n"

    print(parsed_list)

    return parse_file_list(parsed_list, file_quantity)
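
A quick usage sketch (not from the original post) against two lines of the example listing above; the expected output is noted in the comments:

# Usage sketch: run the parser over a small sample listing.
sample = (
    "drwx------   3 user group            0 Dec  2 12:09 Java_server_proxy\n"
    "-rw-------   1 user group        15672 Dec  2 12:00 Java_server_proxy.rar\n"
)

names, dates, sizes, users, perms = Get_File_Name(sample)
print(names)  # Java_server_proxy and Java_server_proxy.rar, one per line
print(perms)  # drwx------ and -rw-------, one per line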
