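I have a 5-page static website. The logo links to the home page, index.html. When I click the logo, it takes me to the home page and changes the URL in the address bar to www.mydomain.com/index.html.
For SEO purposes, I want this URL to stay as www.mydomain.com/, without index.html at the end.
How can I achieve this? Does it require a rule in .htaccess, or is there some other solution?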
Answer 0 (score: 1)
Just change the link to point to / instead of /index.html. Apache will route any request for / to the appropriate index file.
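If the links are generated or rewritten by a script rather than edited by hand, the same idea can be sketched in Python. This is only an illustration under assumptions: the helper name `strip_index` and its `index_name` parameter are hypothetical, and it only handles paths that end in `/index.html`.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_index(url: str, index_name: str = "index.html") -> str:
    """Return url with a trailing '/index.html' path segment reduced to '/'.

    Hypothetical helper for illustration; other URLs are returned unchanged.
    """
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/" + index_name):
        # Drop the file name but keep the trailing slash of its directory.
        path = path[: -len(index_name)]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for link in ("/index.html", "http://www.mydomain.com/index.html", "/about.html"):
        print(link, "->", strip_index(link))
```

Only links that point at an index file are changed; a link such as /about.html passes through untouched.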
Answer 1 (score: 0)
Use "/" instead of index.html; it will go to the home page.
from ..items import DmoztutorialItem
import scrapy


class DmozSpiderSpider(scrapy.Spider):
    name = 'Dmoz'
    start_urls = ['http://dmoz-odp.org/']
    about_page = 'http://dmoz-odp.org/docs/en/about.html'
    editor = 'http://dmoz-odp.org/docs/en/help/become.html'

    def parse(self, response):
        # collect data on first page
        items = {
            'Navbar': response.css('#main-nav a::text').extract(),
            'Category_names': response.css('.top-cat a::text').extract(),
            'Subcategories': response.css('.sub-cat a::text').extract(),
            'About_page': self.about_page,
            'Become_an_editor': self.editor
        }
        # response.follow() takes one URL and one callback, so request the
        # about page here and carry the collected data along in meta
        yield response.follow(self.about_page, callback=self.parse_about, meta={'items': items})

    def parse_about(self, response):
        # do your stuff on second page
        items = response.meta['items']
        items['Headings'] = response.css('h2::text , #mainContent h1::text').extract()  # add your logics
        items['Paragraphs'] = response.css('p::text').extract()
        items['3 Projects'] = response.css('li~ li+ li b a::text , li:nth-child(1) b a::text').extract()
        items['About Dmoz'] = response.css('.nav ul a::text , li:nth-child(2) b a::text').extract()
        items['Languages'] = response.css('.nav~ .nav a::text').extract()
        items['You can make a difference'] = response.css('dd::text , #about-contribute::text').extract()
        items['Further information'] = response.css('li::text , #about-more-info a::text').extract()
        # chain to the third page instead of yielding a partial result
        yield response.follow(self.editor, callback=self.parse_editor, meta={'items': items})

    def parse_editor(self, response):
        # do your stuff on third page, then emit the combined result
        editor_items = response.meta['items']
        editor_items['Heading'] = response.css('#mainContent h1::text').extract()
        yield editor_items