Azure OpenAI: Summarize an article

 

Images from https://www.pexels.com/@shantanu-kumar-433289388/

In this post, we use Azure OpenAI to summarize an article. The code is short, and changing a single persona parameter gives us summaries written from different points of view.

I use an Azure OpenAI service, so you will need one to run the code as written. Alternatively, you can use any other OpenAI service and replace AzureChatOpenAI in the code with the chat model for your provider.

Dependencies

  • python = "^3.10"
  • bs4 = "^0.0.1"
  • openai = "^0.27.7"
  • langchain = "^0.0.189"
  • python-dotenv = "^1.0.0"

Environment Parameters

OPENAI_API_TYPE="azure"
OPENAI_API_BASE="<azure openai endpoint>"
OPENAI_API_KEY="<azure openai key>"
OPENAI_API_VERSION="<azure openai API version>"
I set OPENAI_API_VERSION to 2023-03-15-preview.
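
Before running the script, it can help to confirm that all four parameters are set. The following is a minimal sketch of such a check, not something LangChain or the openai package provides; the helper name missing_settings is hypothetical.

```python
import os

# The four environment parameters listed above.
REQUIRED_SETTINGS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)


def missing_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

Calling missing_settings() right after load_dotenv() returns an empty list when everything is configured, and otherwise the names that still need a value.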

Source code

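We use BeautifulSoup to extract the text content from the URL. The chat prompt is built from two templates: a system message that sets the persona, and a human message that carries the article text.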
from urllib.request import urlopen
from bs4 import BeautifulSoup
from dotenv import load_dotenv

from langchain import LLMChain
from langchain.chat_models import AzureChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)


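def extract(url: str) -> str:
    """Download a web page and return its visible text, one fragment per line."""
    # Close the HTTP response as soon as the page has been read.
    with urlopen(url) as response:
        html = response.read()
    soup = BeautifulSoup(html, features="html.parser")

    # Remove script and style elements so their contents are not treated as text.
    for element in soup(["script", "style"]):
        element.extract()
    text = soup.get_text()

    # Strip each line, split on double spaces, and drop empty fragments.
    lines = (line.strip() for line in text.splitlines())
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    text = "\n".join(chunk for chunk in chunks if chunk)
    return text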


if __name__ == "__main__":
    load_dotenv()

    text = extract(
        "https://www.bbc.com/culture/article/20230529-the-people-who-dont-wash-their-clothes"
    )

    llm = AzureChatOpenAI(deployment_name="gpt-35-turbo")

    chat_prompts = ChatPromptTemplate.from_messages(
        [
            SystemMessagePromptTemplate.from_template(
                template="You are a {persona}, help to summarize a document."
            ),
            HumanMessagePromptTemplate.from_template('summarize: "{content}"'),
        ]
    )

    chain = LLMChain(llm=llm, prompt=chat_prompts)
    response = chain.generate([{"persona": "sustainability expert", "content": text}])

    print(response.generations[0][0].text)
    print(response.llm_output)
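
To see what the whitespace clean-up at the end of extract() does, here is a small sketch that applies the same three steps to a plain string instead of a downloaded page. The function name normalize_whitespace and the sample string are made up for illustration.

```python
def normalize_whitespace(text: str) -> str:
    """Apply the same clean-up steps as extract() to a plain string."""
    # Strip leading and trailing whitespace from every line.
    lines = (line.strip() for line in text.splitlines())
    # Split each line on double spaces, which tend to separate unrelated fragments.
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # Drop empty fragments and put one fragment on each line.
    return "\n".join(chunk for chunk in chunks if chunk)


sample = "  Title  \n\n  First part  second part \n"
cleaned = normalize_whitespace(sample)
print(cleaned)
```

The blank line disappears, and the two fragments separated by a double space end up on separate lines.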

Output

The "no-wash" movement is gaining popularity as people are cutting down on their laundry habits due to environmental concerns, rising electricity costs, and convenience. Raw denim wearers, for example, avoid washing their jeans to achieve high-contrast patterns and softness. They use other ways to care for their garments, such as exposing them to UV rays or airing them overnight. Similarly, designer Stella McCartney caused headlines by detailing her low-clothes-cleaning habits. However, experts suggest that reducing the frequency of clothes washing is the right choice for the environment, but they don't advocate a complete washing machine moratorium. The best approach is to be flexible and wash things on lower temperatures or do a short refresh cycle without any washing powder at all, especially if the clothes don't smell. The movement is not about destroying the planet; it's about trying to get the balance right.
{
    'token_usage': {
        'completion_tokens': 177, 'prompt_tokens': 2434, 'total_tokens': 2611
    },
    'model_name': 'gpt-3.5-turbo'
}


When I change the persona to "doctor", I get a different summary.

The article discusses a growing movement of people who are washing their clothes less or not at all, citing reasons such as environmental concerns, rising electricity costs and aesthetic preferences. The article features interviews with individuals who have adopted this lifestyle, including a man who participates in a denim low-wash competition and a woman who wears the same dress every day for 100 days. While reducing laundry frequency is better for the environment and can save time, experts caution that washing clothes is important for medical and hygiene reasons. The article also suggests tips for washing clothes in the most effective way.
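
Rather than editing the persona by hand for each run, the summaries could be collected in a loop. The sketch below assumes a callable that takes a persona and the article text and returns a summary; summarize_for_personas and fake_summarize are hypothetical names, and the fake callable stands in for the real model call so the example runs without an Azure OpenAI service.

```python
from typing import Callable, Dict, Iterable


def summarize_for_personas(
    personas: Iterable[str],
    content: str,
    summarize: Callable[[str, str], str],
) -> Dict[str, str]:
    """Call summarize(persona, content) once per persona and collect the results."""
    return {persona: summarize(persona, content) for persona in personas}


# Stand-in for the real model call; it only reports what it was given.
def fake_summarize(persona: str, content: str) -> str:
    return f"[{persona}] summary of {len(content)} characters"


summaries = summarize_for_personas(
    ["sustainability expert", "doctor"], "article text", fake_summarize
)
for persona, summary in summaries.items():
    print(persona, "->", summary)
```

With the chain from the source code above, the callable could be something like lambda persona, content: chain.run(persona=persona, content=content), assuming the installed LangChain version accepts keyword arguments in run.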

Observations

  1. LangChain makes it easy to initialize the LLM. It picks up the environment parameters automatically.
  2. LangChain makes it easy to create prompt templates.
  3. There is a lot of sample code available for OpenAI.


