An advanced Go web client

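Because the web client of the previous section is fairly simple and offers no flexibility, in this section you will learn how to read a URL more elegantly, without using the http.Get() function and with more options. The utility is named advancedWebClient.go and is presented in five parts.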

The first part of advancedWebClient.go contains the following code:

package main

import (
    "fmt"
    "net/http"
    "net/http/httputil"
    "net/url"
    "os"
    "path/filepath"
    "strings"
    "time"
)

The second part of advancedWebClient.go is as follows:

func main() {
    if len(os.Args) != 2 {
        fmt.Printf("Usage: %s URL\n", filepath.Base(os.Args[0]))
        return
    }
    URL, err := url.Parse(os.Args[1])
    if err != nil {
        fmt.Println("Error in parsing:", err)
        return
    }

The third part of advancedWebClient.go contains the following code:

    c := &http.Client{
        Timeout: 15 * time.Second,
    }

    request, err := http.NewRequest("GET", URL.String(), nil)
    if err != nil {
        fmt.Println("Get:", err)
        return
    }
    httpData, err := c.Do(request)
    if err != nil {
        fmt.Println("Error in Do():", err)
        return
    }
    // Close the response body once main() returns.
    defer httpData.Body.Close()

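The http.NewRequest() function returns an http.Request object, given a request method, a URL, and an optional body. The Do() method of http.Client, called here as c.Do(), sends an HTTP request (http.Request) and returns an HTTP response (http.Response). In other words, Do() does the job of http.Get() in a more explicit way.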

The "GET" string passed to http.NewRequest() can be replaced with the http.MethodGet constant.
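As an illustration of the extra flexibility that a request object gives you, the following minimal sketch builds a request with http.MethodGet, sets a custom header on it before sending it, and reads the reply. The fetch() and newEchoServer() helpers and the User-Agent value are assumptions made for this example and are not part of advancedWebClient.go; the local httptest server merely stands in for a real web site so that the program runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch (hypothetical helper) sends a GET request with a custom
// User-Agent header and returns the status code and the response body.
func fetch(target, userAgent string) (int, string, error) {
	c := &http.Client{Timeout: 15 * time.Second}

	request, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return 0, "", err
	}
	// Headers can be adjusted before the request is sent.
	request.Header.Set("User-Agent", userAgent)

	response, err := c.Do(request)
	if err != nil {
		return 0, "", err
	}
	defer response.Body.Close()

	body, err := io.ReadAll(response.Body)
	if err != nil {
		return 0, "", err
	}
	return response.StatusCode, string(body), nil
}

// newEchoServer (hypothetical helper) starts a local test server that
// replies with the User-Agent header it received.
func newEchoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprint(w, r.Header.Get("User-Agent"))
		}))
}

func main() {
	server := newEchoServer()
	defer server.Close()

	status, body, err := fetch(server.URL, "advancedWebClient/0.1")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(status, body)
}
```

The same pattern should apply to other headers or request methods; http.Get() offers no place to make such adjustments.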

The fourth part of advancedWebClient.go contains the following code:

    fmt.Println("Status code:", httpData.Status)
    header, _ := httputil.DumpResponse(httpData, false)
    fmt.Println(string(header))

    contentType := httpData.Header.Get("Content-Type")
    characterSet := strings.SplitAfter(contentType, "charset=")
    if len(characterSet) > 1 {
        fmt.Println("Character Set:", characterSet[1])
    }

    if httpData.ContentLength == -1 {
        fmt.Println("ContentLength is unknown!")
    } else {
        fmt.Println("ContentLength:", httpData.ContentLength)
    }

In the preceding code, you can see how to start searching the server response in order to find what you are looking for.
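Splitting the Content-Type value on "charset=" works for simple values, but it would keep any quotes or further parameters that follow the character set. As an alternative sketch, the standard mime.ParseMediaType() function can parse the header value for you. The charsetOf() helper and the sample header values below are assumptions made for this example.

```go
package main

import (
	"fmt"
	"mime"
)

// charsetOf (hypothetical helper) returns the charset parameter of a
// Content-Type value, or an empty string if the value has no charset
// parameter or cannot be parsed.
func charsetOf(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	return params["charset"]
}

func main() {
	fmt.Println(charsetOf("text/html; charset=utf-8"))
	fmt.Println(charsetOf(`text/html; charset="ISO-8859-7"; boundary=x`))
}
```

In advancedWebClient.go, the argument would be httpData.Header.Get("Content-Type").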

The last part of advancedWebClient.go is as follows:

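    length := 0
    var buffer [1024]byte
    r := httpData.Body
    for {
        n, err := r.Read(buffer[0:])
        // Read may return n > 0 together with an error such as io.EOF,
        // so count the bytes before checking err.
        length = length + n
        if err != nil {
            // At the end of the body this prints EOF.
            fmt.Println(err)
            break
        }
    }
    fmt.Println("Calculated response data length:", length)
}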

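In the preceding code, you can see a technique for calculating the size of the server HTTP response. If you want to display the HTML output on your screen, you can print the bytes that each call to r.Read() stores in the buffer variable, that is, buffer[0:n].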

Using advancedWebClient.go to visit a web page generates the following output, which is richer than before:

$ go run advancedWebClient.go http://www.mtsoukalos.eu
Status code: 200 OK
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: no-cache, must-revalidate
Connection: keep-alive
Content-Language: en
Content-Type: text/html; charset=utf-8
Date: Sat, 24 Mar 2018 18:52:17 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Server: Apache/2.4.25 (Debian) PHP/5.6.33-0+deb8u1 mod_wsgi/4.5.11 Python/2.7
Vary: Accept-Encoding
Via: 1.1 varnish (Varnish/5.0)
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Generator: Drupal 7 (http://drupal.org)
X-Powered-By: PHP/5.6.33-0+deb8u1
X-Varnish: 886025

Character Set: utf-8
ContentLength is unknown!
EOF
Calculated response data length: 50176

Executing advancedWebClient.go against a different URL returns slightly different output:

$ go run advancedWebClient.go http://www.google.com
Status code: 200 OK
HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-7
Date: Sat, 24 Mar 2018 18:52:38 GMT
Expires: -1
P3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
Set-Cookie: 1P_JAR=2018-03-24-18; expires=Mon, 23-Apr-2018 18:52:38 GMT; path=/;domain=.google.gr
Set-Cookie: NID=126=csX1_koD30SJcC_1jAfcM2V8kTfRkppmAdmLjINLfclracMxuk6JGe4glc0Pjs8uD00bqGaxkSW-J-ZNDJexG2ZX9pNB9E_dRc2y1KZ05V7pk0boczE2FtS1zb50Uof1; expires=Sun, 23-Sep-2018 18:52:38 GMT; path=/; domain=.google.gr; HttpOnly
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block

Character Set: ISO-8859-7
ContentLength is unknown!
EOF
Calculated response data length: 10240

If you try to fetch an erroneous URL with advancedWebClient.go, you will get the following kind of output:

$ go run advancedWebClient.go http://www.google
Error in Do(): Get http://www.google: dial tcp: lookup www.google: no such host
$ go run advancedWebClient.go www.google.com
Error in Do(): Get www.google.com: unsupported protocol scheme ""
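Note that the second error is reported by Do() and not by url.Parse(): a string such as www.google.com is accepted by url.Parse() as a relative reference with an empty scheme. If you would rather reject such input before sending anything, a check along the following lines might help. The checkURL() helper and its error messages are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"net/url"
)

// checkURL (hypothetical helper) parses raw and reports an error unless
// it is an absolute http or https URL with a host part.
func checkURL(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme %q in %q", u.Scheme, raw)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("missing host in %q", raw)
	}
	return u, nil
}

func main() {
	for _, raw := range []string{"http://www.google.com", "www.google.com"} {
		if _, err := checkURL(raw); err != nil {
			fmt.Println("Rejected:", err)
			continue
		}
		fmt.Println("Accepted:", raw)
	}
}
```

A check like this cannot detect a host that does not exist, such as www.google in the first example; that error still surfaces only when the request is sent.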

Feel free to modify advancedWebClient.go to make its output match your requirements!
