awk: extract text between patterns

Discussion in 'Programming/Scripts' started by tera7, Jan 25, 2009.

  1. tera7

    tera7 New Member

    Hello, I have output from linkex that looks like this:

    a:27:{s:2:"id";i:1003;s:5:"added";i:1204908303;s:11:"lastchecked";i:1232545131;s:10:"laststatus";s:6:"200 OK";s:4:"lurl";s:22:"";s:4:"rurl";s:22:"";s:6:"anchor";s:15:"Linux downloads";s:11:"description";s:15:"Linux downloads";s:10:"categories";a:1:{i:0;s:4:"1001";}s:5:"email";s:15:"[email protected]";s:9:"skipcheck";i:0;s:12:"skippagerank";i:1;s:11:"minpagerank";i:3;s:6:"status";i:1;s:4:"wmip";s:12:"";s:4:"ldom";s:15:"";s:4:"rdom";s:15:"";s:4:"edom";s:9:"";s:6:"ldomip";s:14:"";s:6:"rdomip";s:14:"";s:6:"edomip";s:12:"unresolvable";s:8:"pagerank";i:3;s:12:"pagerankinfo";a:2:{s:4:"hash";s:32:"c5a53959bf9a17a282f3ef304ec713d8";s:4:"date";i:1232545570;}s:5:"notes";s:0:"";s:6:"weight";i:0;s:6:"addedf";s:10:"2008.03.07";s:12:"lastcheckedf";s:10:"2009.01.21";}

    I want to match the text between http and "; so I can extract the URL.

  2. ebal

    ebal New Member

    cut -d '"' -f14 php_session_file
  3. topdog

    topdog Active Member HowtoForge Supporter

    awk can do it too
    awk -F'"' '{ print $14 }' session_file
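
    Both commands work because the record is split on double quotes, and on the full line above the lurl value lands in field 14. A rough sketch for checking field positions, assuming a shortened made-up record with placeholder URLs (not the real linkex data):

```shell
# Hypothetical, shortened record; the URLs are placeholders, not from the original data.
line='a:2:{s:4:"lurl";s:19:"http://example.com/";s:4:"rurl";s:19:"http://example.org/";}'

# Print every quote-delimited field with its number, so you can see
# which field holds the value you are after.
fields=$(printf '%s\n' "$line" | awk -F'"' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"
```

    On this short record the lurl value is field 4 and the rurl value is field 8. The field number shifts whenever keys are added or reordered, so it is worth re-checking against your own files.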
  4. tera7

    tera7 New Member

    OK, thanks. I did something similar.

    grep http /location/linkex/data/links/* | awk -F 'rurl' '{print $2}' | awk -F '"' '{print $3}'
    Last edited: Jan 26, 2009
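
    The same idea (anchor on the rurl key, then take everything up to the next double quote) can probably be done in a single awk call. A minimal sketch, assuming a made-up record with placeholder URLs and assuming the value itself contains no double quote:

```shell
# Hypothetical record with placeholder URLs.
line='a:2:{s:4:"lurl";s:19:"http://example.com/";s:4:"rurl";s:19:"http://example.org/";}'

rurl=$(printf '%s\n' "$line" | awk '
  # Locate the rurl key and its length prefix, then keep the text
  # that follows it up to (not including) the next double quote.
  match($0, /"rurl";s:[0-9]+:"/) {
    rest = substr($0, RSTART + RLENGTH)
    sub(/".*/, "", rest)
    print rest
  }')
printf '%s\n' "$rurl"
```

    Given file names instead of standard input, the same awk program should read the files directly, which would make the leading grep unnecessary.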
  5. ebal

    ebal New Member

    Another way, which I believe is more efficient, is this:

    grep -Eo 'http://[a-z0-9]{1,}\.[a-z0-9]{1,}\.[a-z0-9]{1,}' /location/linkex/data/links/* | sort | uniq

    It would be nice if you posted the timings for your pipeline and for the grep above:

    time grep -Eo 'http://[a-z0-9]{1,}\.[a-z0-9]{1,}\.[a-z0-9]{1,}' /location/linkex/data/links/* | sort | uniq
    time grep http /location/linkex/data/links/* | awk -F 'rurl' '{print $2}' | awk -F '"' '{print $3}'
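
    The host pattern in the grep above only covers lowercase three-part names without hyphens, ports or paths. A looser sketch that follows the original request (everything from http up to the closing double quote), assuming made-up sample input with placeholder URLs:

```shell
# Hypothetical sample input; the URLs are placeholders.
sample='s:4:"lurl";s:19:"http://example.com/";s:4:"rurl";s:26:"http://my-site.example/a/b";'

# Match from http:// up to, but not including, the next double quote,
# then de-duplicate the results.
urls=$(printf '%s\n' "$sample" | grep -o 'http://[^"]*' | sort -u)
printf '%s\n' "$urls"
```

    When grep is run over several files it prefixes each match with the file name, so identical URLs from different files will not collapse in sort unless the prefix is suppressed (grep -h in GNU and BSD grep).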
