awk: choose text between patterns

Discussion in 'Programming/Scripts' started by tera7, Jan 25, 2009.

  1. tera7

    tera7 New Member

    Hello, I have output from linkex that looks like this:

    a:27:{s:2:"id";i:1003;s:5:"added";i:1204908303;s:11:"lastchecked";i:1232545131;s:10:"laststatus";s:6:"200 OK";s:4:"lurl";s:22:"http://www.linux23.com";s:4:"rurl";s:22:"http://www.linux23.com";s:6:"anchor";s:15:"Linux downloads";s:11:"description";s:15:"Linux downloads";s:10:"categories";a:1:{i:0;s:4:"1001";}s:5:"email";s:15:"[email protected]";s:9:"skipcheck";i:0;s:12:"skippagerank";i:1;s:11:"minpagerank";i:3;s:6:"status";i:1;s:4:"wmip";s:12:"85.75.185.89";s:4:"ldom";s:15:"www.linux23.com";s:4:"rdom";s:15:"www.linux23.com";s:4:"edom";s:9:"otenet.gr";s:6:"ldomip";s:14:"89.185.228.200";s:6:"rdomip";s:14:"89.185.228.200";s:6:"edomip";s:12:"unresolvable";s:8:"pagerank";i:3;s:12:"pagerankinfo";a:2:{s:4:"hash";s:32:"c5a53959bf9a17a282f3ef304ec713d8";s:4:"date";i:1232545570;}s:5:"notes";s:0:"";s:6:"weight";i:0;s:6:"addedf";s:10:"2008.03.07";s:12:"lastcheckedf";s:10:"2009.01.21";}



    and I want to match the text between http and "; so I can extract the URL.

    Thanks
     
  2. ebal

    ebal New Member

    cut -d '"' -f14 php_session_file
     
  3. topdog

    topdog Active Member HowtoForge Supporter

    awk can do it too
    Code:
    awk -F'"' '{ print $14 }' session_file
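To see why field 14 is the right one in both the cut and awk answers, it can help to number every field that results from splitting on the double quote. The sketch below uses a shortened copy of the record from the first post (an assumption: real linkex files carry more keys, but the leading layout is the same), and the variable names are only illustrative.

```shell
# Shortened sample record; the leading keys match the linkex line in the first post.
record='a:27:{s:2:"id";i:1003;s:5:"added";i:1204908303;s:11:"lastchecked";i:1232545131;s:10:"laststatus";s:6:"200 OK";s:4:"lurl";s:22:"http://www.linux23.com";s:4:"rurl";s:22:"http://www.linux23.com";}'

# Number every quote-delimited field so the wanted index can be read off.
fields=$(printf '%s\n' "$record" | awk -F'"' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"

# Field 12 is the key "lurl" and field 14 is its value; cut and awk agree on it.
with_awk=$(printf '%s\n' "$record" | awk -F'"' '{ print $14 }')
with_cut=$(printf '%s\n' "$record" | cut -d '"' -f14)
echo "$with_awk"
echo "$with_cut"
```

The fixed index only holds as long as every record has the same string fields in the same order before lurl; an extra quoted value earlier in the line would shift it.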
     
  4. tera7

    tera7 New Member

    OK, thanks. I did something similar:

    grep http /location/linkex/data/links/* | awk -F 'rurl' '{print $2}' | awk -F '"' '{print $3}'
     
    Last edited: Jan 26, 2009
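The two-stage awk pipeline above finds the value by the key name rurl rather than by a fixed position. A possible single-awk variant of the same idea is sketched here; it is not from the thread, and the sample record (with a made-up rurl of http://www.example.org) is hypothetical.

```shell
# Hypothetical shortened record; rurl differs from lurl so the lookup is visible.
record='a:27:{s:2:"id";i:1003;s:4:"lurl";s:22:"http://www.linux23.com";s:4:"rurl";s:22:"http://www.example.org";s:6:"anchor";s:15:"Linux downloads";}'

# Split on double quotes, find the field that is exactly "rurl",
# and print the field two positions later, which holds its value.
rurl=$(printf '%s\n' "$record" |
  awk -F'"' '{ for (i = 1; i + 2 <= NF; i++) if ($i == "rurl") { print $(i + 2); next } }')

echo "$rurl"
```

Because the lookup is by key, it keeps working if other string fields are added or removed before rurl. It would still be fooled by a record in which some other value happens to be exactly the text rurl.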
  5. ebal

    ebal New Member

    Another way, which I believe is more efficient, is this:

    grep -Eo 'http://[a-z0-9]{1,}\.[a-z0-9]{1,}\.[a-z0-9]{1,}' /location/linkex/data/links/* | sort | uniq

    It would be nice if you could post the timings for what you did and for the grep above:

    time grep -Eo 'http://[a-z0-9]{1,}\.[a-z0-9]{1,}\.[a-z0-9]{1,}' /location/linkex/data/links/* | sort | uniq
    time grep http /location/linkex/data/links/* | awk -F 'rurl' '{print $2}' | awk -F '"' '{print $3}'
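Before comparing timings it is worth checking that both pipelines return the same URLs. The sketch below does that on two small, made-up files in a temporary directory (the file names 1001 and 1002 and the second URL are assumptions). It adds -h to the grep-only version, because with several input files grep otherwise prefixes each match with the file name, and it escapes the dots so they match literal dots only.

```shell
# Build two small sample files in a temporary directory (hypothetical data).
dir=$(mktemp -d)
printf '%s\n' 'a:2:{s:4:"lurl";s:22:"http://www.linux23.com";s:4:"rurl";s:22:"http://www.linux23.com";}' > "$dir/1001"
printf '%s\n' 'a:2:{s:4:"lurl";s:22:"http://www.example.org";s:4:"rurl";s:22:"http://www.example.org";}' > "$dir/1002"

# grep-only approach: -h drops the file name prefix, -o prints only the match.
by_grep=$(cd "$dir" && grep -hEo 'http://[a-z0-9]+\.[a-z0-9]+\.[a-z0-9]+' * | sort -u)

# grep + awk approach from the earlier post, deduplicated the same way.
by_awk=$(cd "$dir" && grep http * | awk -F 'rurl' '{print $2}' | awk -F '"' '{print $3}' | sort -u)

echo "$by_grep"
echo "$by_awk"

rm -rf "$dir"
```

Once the outputs agree, each pipeline can be prefixed with time in bash to compare them on the real data. Note that the character class here does not allow hyphens in host names, so the grep-only version may miss some URLs that the awk version finds.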
     
